Abstract 

We investigate the infinitely many demes limit of the genealogy of a sample of individuals from a 
subdivided population that experiences sporadic mass extinction events. By exploiting a separation 
of time scales that occurs within a class of structured population models generalizing Wright's island 
model, we show that as the number of demes tends to infinity, the limiting form of the genealogy can 
be described in terms of the alternation of instantaneous scattering phases that depend mainly on local 
demographic processes, and extended collecting phases that are dominated by global processes. When 
extinction and recolonization events are local, the genealogy is described by Kingman's coalescent, 
and the scattering phase influences only the overall rate of the process. In contrast, if the demes 
left vacant by a mass extinction event are recolonized by individuals emerging from a small number 
of demes, then the limiting genealogy is a coalescent process with simultaneous multiple mergers 
(a Ξ-coalescent). In this case, the details of the within-deme population dynamics influence not 
only the overall rate of the coalescent process, but also the statistics of the complex mergers that 
can occur within sample genealogies. These results suggest that the combined effects of geography 
and disturbance could play an important role in producing the unusual patterns of genetic variation 
documented in some marine organisms with high fecundity. 

Keywords: genealogy, Ξ-coalescent, extinction/recolonization, disturbance, metapopulation,
population genetics, separation of time scales.



1 Introduction 

In this article, we investigate a class of population genetics models that describe a population of individuals 
subdivided into D demes which are subject to sporadic mass extinction events. In general, we will think 
of these demes as corresponding to geographically distinct subpopulations such as occur in Wright's island
model (Wright, 1931), but this structure could also arise in other ways, such as through the association
of homologous chromosomes within different individuals of a diploid species. Whatever the source of
the structure, many species are subject to recurrent disturbances which, if severe enough, can result in
the extinction of a large proportion of the population (Sousa, 1984). Important sources of widespread
disturbance include fire, severe storms, drought, volcanic eruptions, earthquakes, insect outbreaks, and 
disease epidemics. Our goal in this paper is to characterize the effects that such events have on the 
genealogy of a sample of individuals or genes collected from the entire population. Specifically, we will 
identify a set of conditions which will guarantee that in the limit of infinitely many demes, the genealogy 
of the sample converges to a process which alternates between two phases: an extended phase during 
which ancestral lineages occupy distinct demes, and an effectively instantaneous phase that begins each 
time two or more lineages are gathered into the same deme and ends when these are again scattered into 
different subpopulations through a combination of mergers and migrations. The existence of this limit is 
a consequence of the separation of time scales between demographic events occurring within individual 
demes and those affecting the global dynamics of the population. 

This study was partly motivated by recent investigations of the population genetics of several marine
organisms whose genealogies appear to depart significantly from Kingman's coalescent (see Section 1.2).
Based on their analysis of sequence polymorphism in a population of the Pacific oyster,
Eldon and Wakeley (2006) and more recently Sargsyan and Wakeley (2008) suggest that the genealogies
of some marine organisms with high fecundity and sweepstakes recruitment may be better described by a 
class of coalescent processes that generalize Kingman's coalescent by allowing for simultaneous multiple 
mergers. Indeed, in such organisms, the capacity of individuals to spawn millions of offspring makes it 
possible, in theory at least, for a substantial fraction of the population to be descended from a single 
parent. However, depending on the life history and ecology of the species in question, this could happen 
in several different ways. One possibility is that on rare occasions, individuals give birth to such a large 
number of offspring that even with random, independent survival of young, these cohorts constitute a
sizable proportion of the next generation. Such a scenario has been studied by Schweinsberg (2003), who
showed that coalescents with multiple and simultaneous mergers arise naturally when the offspring
distribution has a polynomial tail. Another possibility is that on rare occasions a small number of individuals
could contribute disproportionately many of the surviving offspring not because they are exceptionally 
fecund, but because of mass reproductive failure or death in other parts of the population. What
distinguishes these two scenarios is whether individuals win the recruitment sweepstakes by producing an
exceptionally large number of offspring relative to the long-term average, or by simply giving birth to an 
average (or even below-average) number of offspring at a time when most other individuals experience 
an exceptional failure of reproduction. 

The multiple mergers that occur in the models investigated in this paper arise through a combination 
of both of these factors: mass extinctions create large swathes of unoccupied territory which is then
instantaneously repopulated by individuals emerging from finitely many demes. Of course, one weakness of
this study is that we do not identify the biological mechanism responsible for restricting recolonization in 
this way, and in fact it seems difficult to formulate such a mechanism that is both realistic and consistent 
with the metapopulation models considered in this paper. However, there are several scenarios under 
which similar dynamics could arise in a spatially extended population in which disturbance events tend to 
affect contiguous demes. For example, if dispersal distances are short, then recolonization of vacant habitat
in a one-dimensional population such as along a shoreline or a riparian corridor could be dominated
by individuals recently descended from the small number of demes bordering the affected area. Similar
reasoning might also apply to organisms with fractal-like distributions, such as aquatic or littoral species
in estuarine environments or possibly even HIV-1 populations in the lymphatic system of an infected host.
Alternatively, if regrowth from the margins is slow or even impossible (e.g., because surviving demes are
separated from vacant demes by inhospitable habitat), then a few long-distance migrants could be responsible
for repopulating empty demes even in species with two-dimensional distributions. Furthermore, in
this case, we might also predict that the number of demes contributing recolonizers would be negatively 
correlated with the fecundity of the organism, since less time would be available for additional migrants 
to enter the affected region before the first migrant propagule had completely repopulated the region. 
Although the mathematical analysis is much more challenging than that given here, spatially-explicit 
models incorporating these features are currently under development (Alison Etheridge, pers. comm.). 



1.1 Wright's island model with mass extinctions 

To motivate both the class of models studied in this paper as well as the separation of time scales 
phenomenon that leads to the infinitely-many demes limit, let us begin by considering a version of Wright's 
island model with mass extinctions. Suppose that a population of haploid organisms is subdivided into 
D demes, each of which contains N individuals. We will assume that individuals reproduce continuously, 
i.e., generations are overlapping, and that at rate 1 each individual gives birth to a single offspring which 
settles in that same deme with probability $1 - m$ and otherwise migrates to one of the other $D - 1$
demes, chosen uniformly at random. In either case, we will assume that the deme size is constant and 
that a newborn individual immediately replaces one of the existing N members of the deme in which it 
settles. Notice that if m = 0, then this model reduces to a collection of D independent Moran models in 
populations of constant size N, whereas if N = m = 1, it describes the usual Moran model in a single 
population of size D. However, in the following discussion we will assume that $m > 0$ and that D is very
much larger than N.

Before we account for mass extinctions, let us consider the genealogy of a sample of n individuals 
chosen uniformly at random from the entire population. We first observe that, looking backwards in time, 
each lineage migrates out of its current deme at rate $(D - 1)Nm/(ND) \approx m$. Furthermore, if two lineages
occupy different demes, then for these to coalesce, one of the two must migrate into the deme where the
other lineage currently resides, an event that occurs approximately at rate $m/D$; here we have neglected
terms of order $D^{-2}$ and will continue to do so without further comment. When two lineages are collected
in the same deme, then they can either coalesce immediately, which happens with probability $1/N$, or
they can cohabit within that deme for some random period of time until either they coalesce or they
migrate into different demes. Since two lineages occupying the same deme coalesce at rate $2(1 - m)/N$,
and each lineage, independently of the other, migrates out of the deme at rate $m$, the probability that the
two coalesce rather than migrate is $\chi = (1 - m)/(1 - m + Nm)$. Putting these observations together, it
follows that every time two lineages are collected within the same deme by migration, the total probability
that they coalesce rather than migrate into different demes is $1/N + (1 - 1/N)\chi = 1/(1 - m + Nm)$, and
the time that elapses between entry into the same deme and either coalescence or escape is a mixture of a
point mass at 0 (in case they coalesce at the entry time) and an exponential random variable with mean
$N/(2mN + 2(1 - m))$. In particular, notice that typically much less time is required for two lineages
occupying the same deme to either coalesce or escape (of order N) than is needed for two lineages
occupying different demes to be collected into the same deme (of order D). It is this disparity between
the rate of events happening within individual demes and the rate at which lineages are gathered together
that gives rise to a separation of time scales in the island model. If we rescale time by a factor of D and
let the number of demes tend to infinity, then the time required for two lineages sampled from different
demes to coalesce is exponentially distributed with mean $(1 - m + Nm)/(2m)$.
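
The pairwise calculation above can be checked numerically. The following is a minimal sketch rather than part of the analysis itself: the function names are hypothetical, and exact rational arithmetic is used so that the two expressions for the coalescence probability can be compared without rounding error.

```python
from fractions import Fraction


def coalescence_probability(N, m):
    """Probability that two lineages collected into one deme coalesce
    before they are separated again by migration.

    N is the deme size and m the migration probability; both are
    converted to exact fractions.
    """
    N, m = Fraction(N), Fraction(m)
    immediate = 1 / N                      # coalesce at the entry time
    chi = (1 - m) / (1 - m + N * m)        # coalesce while cohabiting
    return immediate + (1 - immediate) * chi


def mean_pair_coalescence_time(N, m):
    """Mean coalescence time of two lineages from different demes,
    with time rescaled by D and D tending to infinity."""
    N, m = Fraction(N), Fraction(m)
    return (1 - m + N * m) / (2 * m)


if __name__ == "__main__":
    N, m = 10, Fraction(1, 10)
    # The two-step expression agrees with the closed form 1/(1 - m + Nm).
    print(coalescence_probability(N, m), 1 / (1 - m + N * m))
    print(mean_pair_coalescence_time(N, m))
```

Passing `m` as a `Fraction` (or as a string such as `"0.1"`) avoids the binary rounding that a float literal would introduce.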

To complete our description of the coalescent process in this model, we need to consider the possibility 
of more complex coalescent events. We first observe that if n individuals are sampled from D demes, 
then the probability that all of these individuals reside in different demes will be close to one if D is 
much greater than n. Furthermore, because lineages occupying different demes coalesce and migrate 
independently of one another, it is straightforward to show that the probability that three or more 
lineages ancestral to our sample are collected into the same deme is of order $D^{-2}$ or smaller. Likewise,
it can be shown that the probability of having multiple pairs of lineages collected into several demes at 
the same time is similarly negligible. From these observations, it follows that only pairwise coalescence 
events matter in the infinitely-many demes limit, and that if there are n ancestral lineages, then at 
rate $\binom{n}{2} \, 2m/(1 - m + Nm)$, two of these, chosen uniformly at random, coalesce, leaving $n - 1$ ancestral
lineages. In other words, the genealogy for this model can be approximated by a scalar time change of
Kingman's coalescent, with a rate that depends on both the migration rate and the deme size. This result
is essentially due to Wakeley and Aliacar (2001), who considered a similar model with non-overlapping
generations and Wright-Fisher sampling. 
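
To illustrate this time change, the sketch below simulates the number of ancestral lineages under the limiting process and computes the expected time to the most recent common ancestor. It is an illustration under stated assumptions, not a method taken from the works cited: the function names are hypothetical, each pair of lineages is taken to coalesce at rate 2m/(1 - m + Nm), and time is measured in the rescaled units (original time divided by D).

```python
import random
from math import comb


def pair_rate(N, m):
    """Pairwise coalescence rate in the infinitely-many-demes limit."""
    return 2 * m / (1 - m + N * m)


def expected_tmrca(n, N, m):
    """Expected time until n lineages have merged into one.

    With k lineages the total coalescence rate is C(k, 2) * pair_rate,
    so the expected waiting times are summed over k = n, ..., 2.
    """
    rate = pair_rate(N, m)
    return sum(1.0 / (comb(k, 2) * rate) for k in range(2, n + 1))


def simulate_tmrca(n, N, m, rng):
    """Draw one time to the most recent common ancestor."""
    rate = pair_rate(N, m)
    t = 0.0
    for k in range(n, 1, -1):
        # Exponential waiting time while k lineages remain.
        t += rng.expovariate(comb(k, 2) * rate)
    return t


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [simulate_tmrca(5, 10, 0.1, rng) for _ in range(20000)]
    print(sum(draws) / len(draws), expected_tmrca(5, 10, 0.1))
```

Because the deme size and migration rate enter only through `pair_rate`, changing them rescales every waiting time by the same factor, which is what a scalar time change of Kingman's coalescent means here.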

Now let us introduce mass extinction events into this model. Fix $e > 0$ and $y \in [0, 1)$, and suppose that
at rate e/D, the metapopulation suffers a disturbance which causes each deme to go extinct, independently 
of all others, with probability y. For example, we could consider a model in which the demes represent 
small islands or keys in the Caribbean and the disturbances are hurricanes that completely inundate 
those islands lying in their path. Here we are reverting to the original time units, i.e., time has not yet 
been rescaled by a factor of D, and we have chosen the disturbance rate so that mass extinctions occur 
at rates commensurate with coalescence in the pure island model. In keeping with the assumption that 
deme size is constant, we will assume that all of the islands that are left vacant by a mass extinction 
are immediately recolonized by offspring dispersing out of a single source deme that is chosen uniformly 
at random from among the demes unaffected by the disturbance. In addition, we will assume that the 
parent of each colonizing individual is chosen uniformly at random from the N members of the source 
deme. Of course, the entire metapopulation could be extirpated by a mass extinction if y > 0, but the 
probability of this outcome is exponentially small in D and can be disregarded as D tends to infinity. 

Suppose that a mass extinction occurs at a time when there are n ancestral lineages occupying distinct 
demes. Bearing in mind that we are now looking backwards in time, all of the lineages belonging to demes 
that are affected by the disturbance will move into the source deme, where those sharing the same parent 
will immediately coalesce. Thus, one reason that multiple mergers can occur in this model is because of 
the very highly skewed distribution of recolonizing offspring contributed both by individuals and demes
following a mass extinction. Suppose that there are $n_1$ distinct lineages remaining in the source deme
once we account for this initial set of coalescences. These lineages will undergo a random sequence of
migration and coalescence events until there is only one lineage remaining within the source deme. For
example, if $n_1 = 4$, then one possible outcome would see one lineage migrate out of the deme followed by
a pair of binary mergers, leaving only one lineage within the source deme. Whatever the sequence, the
amount of time required to scatter the lineages into different demes will be of order $O(1)$, whereas the
time until either the next mass extinction event or the next binary merger involving lineages outside of
the source deme will be of order $O(D)$. Thus, if we again rescale time by a factor of D, then any sequence
of coalescence and migration events involving a source deme will effectively be instantaneous when we 
let D tend to infinity. This is the second way in which multiple merger events can arise in this model. 
Furthermore, varying the migration rate and deme size changes not only the overall rate of coalescence, 
but also the relative rates of the different kinds of multiple merger events that can occur. For example, if 
Nm is very small, then the coalescent process will be close to a Λ-coalescent (which has multiple mergers,
but not simultaneous multiple mergers) because most lineages that are collected into a source deme by a 
mass extinction event will coalesce before any escape by migration. However, as N increases, so will the 
probability that multiple lineages enter into and then escape from the source deme without coalescing. 
This suggests that at moderate values of Nm, mass extinctions may be likely to result in simultaneous 
mergers (i.e., the coalescent is a Ξ-coalescent), while for very large values of Nm, multiple mergers of all
types will be unlikely and the coalescent process will tend towards Kingman's coalescent. 
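
The dependence of the merger pattern on N and m can be explored with a small simulation of what happens to the lineages gathered into the source deme. This is a rough sketch under assumptions that go beyond the text: with k lineages in the source deme, each pair is assumed to coalesce at rate 2(1 - m)/N and each lineage to migrate out at rate m, independently (the natural extension of the pairwise rates given earlier), and every migrant is assumed to land in a deme containing no other ancestral lineage. The names `scatter` and `classify` are hypothetical.

```python
import random


def scatter(n, N, m, rng):
    """Block sizes left after n lineages pass through the source deme.

    Returns a list, sorted in decreasing order, giving the number of
    sampled lineages descended from each surviving ancestral lineage.
    """
    # Initial coalescence: lineages that draw the same parent merge.
    by_parent = {}
    for _ in range(n):
        parent = rng.randrange(N)
        by_parent[parent] = by_parent.get(parent, 0) + 1
    inside = list(by_parent.values())  # blocks still in the source deme
    escaped = []                       # blocks that have migrated out
    while len(inside) > 1:
        k = len(inside)
        # Probability that the next event is a coalescence: total rate
        # C(k,2)*2(1-m)/N against total migration rate k*m (assumption).
        p_coal = (k - 1) * (1 - m) / ((k - 1) * (1 - m) + N * m)
        if rng.random() < p_coal:
            i, j = rng.sample(range(k), 2)
            merged = inside[i] + inside[j]
            inside = [b for idx, b in enumerate(inside) if idx not in (i, j)]
            inside.append(merged)
        else:
            escaped.append(inside.pop(rng.randrange(k)))
    return sorted(escaped + inside, reverse=True)


def classify(blocks):
    """Label an outcome by the number and size of its merged blocks."""
    merged = [b for b in blocks if b >= 2]
    if not merged:
        return "no merger"
    if len(merged) > 1:
        return "simultaneous mergers"
    return "binary merger" if merged[0] == 2 else "multiple merger"


if __name__ == "__main__":
    rng = random.Random(0)
    for N, m in [(10, 0.001), (10, 0.1), (1000, 0.1)]:
        counts = {}
        for _ in range(5000):
            label = classify(scatter(6, N, m, rng))
            counts[label] = counts.get(label, 0) + 1
        print(N, m, counts)
```

Tabulating the labels for several values of Nm gives a quick, informal view of how the balance between single multiple mergers, simultaneous mergers, and no merger shifts as migration becomes faster relative to coalescence.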



1.2 Neutral genealogies and coalescents 

In the last twenty years, coalescent processes have taken on an increasingly important role in both theoretical
and applied population genetics, where their relationship to genealogical trees has made them powerful 
tools to study the evolution of genetic diversity within a population. Under the assumption of neutrality, 
allelic types do not influence the reproduction of individuals and it is therefore possible to separate 
'type' and 'descent'. This allows us to study the genealogy of a sample of individuals on its own and 
then superimpose a mechanism describing how types are transmitted from parent to offspring, justifying 
the interest in investigating genealogical processes corresponding to particular reproduction mechanisms
without explicit mention of types. We refer to Nordborg (2001) for a review of coalescent theory in
population genetics.

Beginning with the coalescent process introduced by Kingman (1982) to model the genealogy of a
sample of individuals from a large population, three increasingly general classes of coalescent processes
have been described. A key feature shared by all three classes is the following consistency property: the 
process induced on the set of all partitions of {1, ... , n} by the coalescent acting on the partitions of 
{1, . . . , n + k} (obtained by considering only the blocks containing elements of {1, ... ,n}) has the same 
law as the coalescent acting on the partitions of {1, . . . , n}. In terms of genealogies, this property means 
that the genealogy of n individuals does not depend on the size of the sample that contains them. To 
describe these continuous-time Markov processes, it will be convenient to introduce some notation. For 
all $n \in \mathbb{N}$, we denote the set of all partitions of $[n] = \{1, \ldots, n\}$ by $P_n$. In the following, the index $n$
of the set of partitions in which we are working will be referred to as the sample size, an element of
$\{1, \ldots, n\}$ will be called an individual, and 'block' or 'lineage' will be equivalent terminology to refer
to an equivalence class. If $\zeta \in \bigcup_n P_n$, then $|\zeta| = k$ means that the partition $\zeta$ has $k$ blocks. Also, for
$\zeta, \eta \in P_n$ and $k_1, \ldots, k_r \geq 2$, we will write $\eta \subset_{k_1, \ldots, k_r} \zeta$ if $\eta$ is obtained from $\zeta$ by merging exactly $k_1$
blocks of $\zeta$ into one block, $k_2$ into another block, and so on. Kingman's coalescent is defined on $P_n$ for
all $n \geq 1$, as a Markov process with the following Q-matrix: if $\zeta, \eta \in P_n$,

$$
q_K(\zeta \to \eta) =
\begin{cases}
1 & \text{if } \eta \subset_2 \zeta, \\
-\binom{|\zeta|}{2} & \text{if } \eta = \zeta, \\
0 & \text{otherwise.}
\end{cases}
$$

A more general class of exchangeable coalescents, allowing mergers of more than two blocks at a time, was
studied by Pitman (1999) and Sagitov (1999). These coalescents with multiple mergers (or Λ-coalescents)
are in one-to-one correspondence with the finite measures on [0, 1] in the following manner: for a given
coalescent, there exists a unique finite measure Λ on [0, 1] such that the entries $q_\Lambda(\zeta \to \eta)$ of the Q-matrix
of the coalescent, for $\zeta, \eta \in P_n$, are given by

$$
q_\Lambda(\zeta \to \eta) =
\begin{cases}
\int_0^1 \Lambda(dx)\, x^{k-2} (1 - x)^{b-k} & \text{if } \eta \subset_k \zeta \text{ and } |\zeta| = b, \\
-\int_0^1 \Lambda(dx)\, x^{-2} \left(1 - (1 - x)^{b-1} (1 - x + bx)\right) & \text{if } \eta = \zeta \text{ and } |\zeta| = b, \\
0 & \text{otherwise.}
\end{cases}
$$

Kingman's coalescent is recovered by taking $\Lambda = \delta_0$, the point mass at 0. Lastly, a third and wider class
of coalescents was introduced by Möhle and Sagitov (2001) and Schweinsberg (2000), for which mergers
involving more than one ancestor are allowed. These coalescents with simultaneous multiple mergers (or
Ξ-coalescents) are characterized in Schweinsberg (2000) by a finite Borel measure on the infinite ordered
simplex

$$
\Delta = \Big\{ (x_1, x_2, \ldots) : x_1 \geq x_2 \geq \cdots \geq 0, \ \sum_{i=1}^{\infty} x_i \leq 1 \Big\}.
$$

Indeed, to each coalescent corresponds a unique finite measure Ξ on Δ of the form $\Xi = \Xi_0 + a\delta_0$, where
$\Xi_0$ has no atom at zero and $a \in [0, \infty)$, such that the transition rates of the coalescent acting on $P_n$ are
given by

$$
q_\Xi(\zeta \to \eta) = \int_\Delta \frac{\Xi_0(dx)}{\sum_{j=1}^{\infty} x_j^2}
\sum_{l=0}^{s} \ \sum_{i_1 \neq \cdots \neq i_{r+l}} \binom{s}{l}\,
x_{i_1}^{k_1} \cdots x_{i_r}^{k_r}\, x_{i_{r+1}} \cdots x_{i_{r+l}}
\Big(1 - \sum_{j=1}^{\infty} x_j\Big)^{s-l}
+ a\, \mathbf{1}_{\{r = 1,\, k_1 = 2\}}
$$

if $\eta \subset_{k_1, \ldots, k_r} \zeta$ and $s = |\zeta| - \sum_{i=1}^{r} k_i$. The other rates (for $\eta \neq \zeta$) are equal to zero. The Λ-coalescents
are particular cases of Ξ-coalescents, for which $\Xi(x_2 > 0) = 0$.
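
As a concrete check on the Λ-coalescent rates given above, the following sketch evaluates them for a purely atomic measure and confirms that the off-diagonal rates sum to minus the diagonal entry. Restricting Λ to finitely many atoms is an assumption made only to avoid numerical integration, and the function names are hypothetical.

```python
from math import comb


def merger_rate(b, k, atoms):
    """Rate at which one fixed group of k blocks among b merges.

    `atoms` is a list of (x, weight) pairs describing a finite measure
    on [0, 1] concentrated on finitely many points (an assumption).
    """
    return sum(w * x ** (k - 2) * (1 - x) ** (b - k) for x, w in atoms)


def total_rate(b, atoms):
    """Total jump rate out of a partition with b blocks, obtained by
    summing over all groups of k = 2, ..., b blocks."""
    return sum(comb(b, k) * merger_rate(b, k, atoms) for k in range(2, b + 1))


def total_rate_closed_form(b, atoms):
    """Minus the diagonal entry of the Q-matrix; every atom must have x > 0."""
    return sum(
        w * (1 - (1 - x) ** (b - 1) * (1 - x + b * x)) / x ** 2
        for x, w in atoms
    )


if __name__ == "__main__":
    kingman = [(0.0, 1.0)]          # unit point mass at 0
    print(merger_rate(5, 2, kingman), merger_rate(5, 3, kingman))
    atoms = [(0.25, 2.0), (0.5, 1.0)]
    print(total_rate(6, atoms), total_rate_closed_form(6, atoms))
```

With the unit point mass at 0, only pairs merge and each pair does so at rate 1, which recovers Kingman's coalescent.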

As mentioned above, coalescent processes can be used to describe the genealogy of large populations. 
Indeed, a large body of literature has been devoted to describing conditions on the demography of 
a population of finite size N that guarantee that the genealogical process of a sample of individuals 
converges to a coalescent as N tends to infinity. Such limiting results for populations with discrete
non-overlapping generations are reviewed in Möhle (2000) and some examples can be found for instance in
Schweinsberg (2003), Eldon and Wakeley (2006) and Sargsyan and Wakeley (2008). In these examples,
the shape of the limiting coalescent is related to the propensity of individuals to produce a non-negligible 
fraction of the population in the next generation. 

However, the representation of the genealogy as a coalescent requires in particular that any pair of 
lineages has the same chance to coalesce. This condition breaks down when the population is structured 
into subpopulations, since then coalescence will occur disproportionately often between lineages belonging 
to the same deme. To model these kinds of scenarios, structured analogues of coalescent processes were
introduced (see e.g. Notohara, 1990; Wilkinson-Herbots, 1998), which allow lineages both to move between
demes as well as coalesce within demes. Various state spaces have been used to describe a structured 
coalescent, such as vectors in which the i'th component gives the lineages (or their number) present in 
deme i, or vectors of pairs 'block x deme label'. All these representations of a structured genealogy take 
into account the fact that the reproductive or dispersal dynamics may differ between demes, hence the 
need to keep track of the location of the lineages. In contrast, several papers investigate models where the 
structure of the genealogy collapses on an appropriate time scale, i.e., the limiting genealogy no longer
sees the geographical division of the population. In Cox (1989), demes are located at the sites of the
torus $T(D) \subset \mathbb{Z}^d$ of size D and each site can contain at most one lineage. Lineages move between sites
according to a simple random walk, and when one of them lands on a site already occupied, it merges
instantaneously with the inhabitant of this 'deme'. These coalescing random walks, dual to the voter
model on the torus, are proved to converge to Kingman's coalescent as $D \to \infty$. More precisely, Cox
shows that if $n < \infty$ lineages start from n sites independently and uniformly distributed over T(D), then
the process counting the number of distinct lineages converges to the pure death process that describes
the number of lineages in Kingman's coalescent. This analysis is generalized in Cox and Durrett (2002)
and Zähle et al. (2005), where each site of the torus now contains $N \in \mathbb{N}$ individuals and a Moran-type
reproduction dynamics occurs within each deme. Again, the limiting genealogy of a finite number
of particles sampled at distant sites is given by Kingman's coalescent, and convergence is in the same
sense as for Cox's result. Other studies of systems of particles moving between discrete subpopulations
and coalescing do not require that the initial locations of the lineages be thinned out. In Greven et al.
(2007), demes are distributed over the grid $\mathbb{Z}^2$ and the process starts with a Poisson-distributed number
of lineages on each site of a large box of size $D^{\alpha/2}$, for some $\alpha \in (0, 1]$. The authors show that the total
number of lineages alive at times of the form $D^t$ converges in distribution as a process (indexed by $t \geq \alpha$)
to a time-change of the block counting process of Kingman's coalescent, as $D \to \infty$. See Greven et al.
(2007) for many other references related to these ideas.

Our emphasis in this paper will be on the separation of time scales phenomenon and the way in which 
local and global demographic processes jointly determine the statistics of the limiting coalescent process. 
Consequently, we shall always assume that the demes comprising our population are exchangeable, i.e., 
the same demographic processes operate within each deme, and migrants are equally likely to come from 
any one of the D demes. In this simplified setting, we only need to know how lineages are grouped into 
demes, but not the labels of these demes. 



1.3 Separation of time scales 

A separation of time scales can be said to occur whenever different components of a stochastic process 
evolve at rates which greatly differ in their magnitudes. This concept is usually invoked when there is 
a sequence of stochastic processes $(X^D_t, t \geq 0)$ on a space $E$ as well as a function $\eta : E \to E'$ and an
increasing sequence $r_D \to \infty$ such that the processes $(X^D_{r_D^{-1} t}, t \geq 0)$ have a non-trivial limit $(X^{fast}_t, t \geq 0)$
determined by the fast time scale, while the processes $(\eta(X^D_t), t \geq 0)$ (which are only weakly influenced
by the fast evolution) have another non-trivial limit $(X^{slow}_t, t \geq 0)$ determined by the slow time scale.
It is often the case that the processes $(X^D_t, t \geq 0)$ have the Markov property but do not converge to a
limit, while the slow processes $(\eta(X^D_t), t \geq 0)$ do converge, but are not Markovian.



Separation of time scales techniques were first introduced into population genetics by Ethier and Nagylaki
(1980), and since then have been used to study the genealogical processes of structured populations in
several different settings. Nordborg and Krone (2002) consider a population of total size N, evolving
according to a Wright-Fisher model (see Fisher, 1930; Wright, 1931) and distributed over $D < \infty$ demes.
These demes are in turn structured into groups of demes, within which individuals migrate faster (at a
rate of order $N^{-\alpha}$ for an $\alpha \in [0, 1]$) than from one group to another (which occurs at a rate $O(N^{-1})$).
When all demes are connected by fast migration, they show that the structured genealogy collapses to an
unstructured Kingman's coalescent as N tends to infinity, due to the fact that migration is so fast compared
to the coalescence rate (of order $N^{-1}$) that the population becomes well-mixed before the first coalescence
event occurs. When several groups of demes are connected by slow migration, the genealogical process 
converges to a structured coalescent, in which groups of demes act as panmictic populations and
coalescence of lineages within a group is faster than between two groups. These results are made possible by
the fact that the blocks of the partition induced by the genealogy are not affected by a migration event. 
Since only migration occurs on the fast time scale and coalescence is on the slow time scale, forgetting 
about the location of the lineages gives a sequence of (non-Markov) processes which converge on the slow 
time scale to a Markov process. 

Another kind of separation of time scales was studied by Wakeley and co-authors in a series of papers
(see in particular Wakeley, 1998, 1999, 2004; Wakeley and Aliacar, 2001). In these models, a population
evolving in discrete non-overlapping generations occupies D demes, labeled 1, ..., D. Deme i contains
a population of $N_i$ adults and receives $M_i$ migrants each generation. Then, a Wright-Fisher resampling
within each deme brings the population sizes back to their initial values. Other mechanisms can also
be taken into account, such as extinction of a group of demes followed by instantaneous recolonization.
Allowing D to tend to infinity greatly simplifies the analysis of the genealogical processes, and in particular
gives rise to a decomposition of the genealogy of a sample of individuals into two different phases, occurring
on two time scales. Following the terminology introduced in Wakeley (1999), the first phase to occur is
the scattering phase, in which lineages occupying the same deme coalesce or move to empty demes 
('empty' meaning that none of the sampled lineages are in this deme). In the limit, this phase occurs 
on the fast time scale and is therefore viewed as instantaneous. At the end of the scattering phase, all 
remaining lineages lie in different demes. The collecting phase is the following period of time during 
which lineages are gathered together into the same demes by migration or extinction/recolonization, 
where they may merge. The limiting genealogical process is a coalescent on the slow time scale, which
ends when the number of lineages reaches one.

As we have already mentioned, apart from an initial instantaneous burst of mergers (which only 
occurs if multiple individuals are sampled from the same deme) all of the genealogical processes obtained 
in this setting are scalar time changes of Kingman's coalescent. Indeed, in the forwards in time evolution, 
migrants and colonizers are assumed to come from the whole population or from a non-vanishing fraction 
of the demes and so, with probability one, only two of the finitely many lineages of the sample are 
brought into the same deme at a time in the limit. Subsequently, the two lineages either coalesce or 
are scattered again, but in any case the outcome is at most a binary merger. In this paper, we shall 
study coalescent processes that arise in population models which include mass extinctions and general
recolonization mechanisms, and describe the conditions under which the genealogy corresponds to an
unstructured Ξ-coalescent on the slow time scale. To this end, we will speak of 'scattering' and 'collecting' phases in
a more general sense. We prefer to call the 'collecting phase' the period of time during which lineages 
wander among empty demes until a migration or extinction event brings several lineages into the same 
deme. We shall show that, once such a 'geographical collision' has occurred, an instantaneous scattering 
phase follows at the end of which all lineages have merged or moved to empty demes. Another collecting 
phase then starts and so on until the most recent common ancestor of the sample has been reached and 
there is only one lineage remaining. 

1.4 Framework and main results 

Fix $n \in \mathbb{N}$ and consider the genealogy of a sample of n individuals from a population of $D \geq n$ demes (the
following framework also allows $D = \infty$). In the following, we shall suppose that demes are exchangeable
in the sense given in Section 1.2. We shall work in the space defined as follows:

Definition 1. Let $\tilde{P}^s_n$ be the set

$$
\tilde{P}^s_n = \Big\{ \big(\{B_1, \ldots, B_{i_1}\}, \ldots, \{B_{i_{n-1}+1}, \ldots, B_{i_n}\}\big) : 0 \leq i_1 \leq \cdots \leq i_n \leq n,
\ \emptyset \neq B_j \subset [n] \ \forall j \in \{1, \ldots, i_n\}, \ \{B_1, \ldots, B_{i_n}\} \in P_n \Big\}
$$

of n-tuples of sets (we allow some of the components of the n-tuple to be empty), and let us define
the equivalence relation ~ on $\tilde{P}^s_n$ by $\zeta \sim \zeta'$ if and only if there exists a permutation $\sigma$ of $[n]$ such
that, if $\mathbf{B}_1 = \{B_1, \ldots, B_{i_1}\}, \ldots, \mathbf{B}_n = \{B_{i_{n-1}+1}, \ldots, B_{i_n}\}$ are the components of the vector $\zeta$, then
$\zeta' = (\mathbf{B}_{\sigma(1)}, \ldots, \mathbf{B}_{\sigma(n)})$. The quotient of $\tilde{P}^s_n$ by ~ is denoted by $P^s_n$.

We call any $(\{B_1, \ldots, B_{i_1}\}, \ldots, \{B_{i_{n-1}+1}, \ldots, B_{i_n}\}) \in P^s_n$ an unordered structured partition of $[n]$.

In view of the application we have in mind, each component $\mathbf{B}_j$ represents a particular deme containing
some of the lineages ancestral to the sample, and the blocks $B_k$ (for $k \in \{1,\dots,i_n\}$) specify the partition
of the sample determined by the ancestors alive at a particular time. Empty components are used to
guarantee a constant vector size, $n$, independent of the index $D$ used later. In the following, we omit the
term 'unordered' when referring to the structured partitions of Definition 1.

The finite set $\mathrm{P}_n^s$ is endowed with the discrete topology, which is equivalent to the quotient by $\sim$ of
the discrete topology on $\tilde{\mathrm{P}}_n^s$.

Definition 2. A Markov process $\mathcal{P}$ on $\mathrm{P}_n^s$ for which blocks can only merge and change component is
called a structured genealogical process.

To illustrate the possible transitions, let us take n = 5 and consider the following sequence of events: 

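$$\begin{aligned}
(\{\{1\}\},\{\{2\}\},\{\{3\}\},\{\{4\}\},\{\{5\}\}) &\xrightarrow{(i)} (\{\{1\},\{2\}\},\{\{3,4\}\},\{\{5\}\},\emptyset,\emptyset) \\
&\xrightarrow{(ii)} (\{\{1\}\},\{\{2\}\},\{\{3,4,5\}\},\emptyset,\emptyset) \\
&\xrightarrow{(iii)} (\{\{1,2,3,4,5\}\},\emptyset,\emptyset,\emptyset,\emptyset).
\end{aligned}$$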

In this example, we start from the configuration in $\mathrm{P}_5^s$ where each lineage is alone in its deme. During
transition (i), either {1} or {2} changes component and both blocks end up in the same deme (which
creates an empty component in our representation), but remain distinct. In contrast, either {3} or {4}
also moves (emptying another component), but then the two blocks merge into a single block {3,4}
which is not allowed to split during later transitions. Block {5} remains alone in its component. During
transition (ii), lineages {1} and {2} are scattered again into two different demes by the movement of one
of them, while one of the lineages {3,4} or {5} changes component and the two blocks merge. Eventually,
all the remaining blocks are gathered into the same deme and merge into a single block. Since elements
of $\mathrm{P}_5^s$ are defined up to a permutation of their components and since a block is not allowed to split, no
other change is possible from the state reached after transition (iii).
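The example above can be replayed in code. The following is a minimal sketch, assuming a representation that is not part of the paper: a structured partition is stored as a sorted tuple of non-empty demes, so that two representatives differing by a permutation of components (or by empty padding) coincide. The helper names `canon`, `blocks` and `only_merges` are hypothetical.

```python
def canon(demes):
    """Canonical form of an unordered structured partition.

    `demes` is an iterable of demes, each deme an iterable of blocks (sets of ints).
    Empty components are dropped: they are only padding in the n-tuple.
    """
    return tuple(sorted(
        tuple(sorted(tuple(sorted(block)) for block in deme))
        for deme in demes if deme
    ))


def blocks(zeta):
    """Blocks of the unstructured partition induced by a canonical form."""
    return [set(block) for deme in zeta for block in deme]


def only_merges(before, after):
    """True if every block of `after` is a union of blocks of `before`."""
    old = blocks(before)
    for new in blocks(after):
        parts = [b for b in old if b <= new]
        if set().union(*parts) != new:
            return False
    return True


# The three transitions (i), (ii), (iii) of the example with n = 5.
s0 = canon([[{1}], [{2}], [{3}], [{4}], [{5}]])
s1 = canon([[{1}, {2}], [{3, 4}], [{5}], [], []])
s2 = canon([[{1}], [{2}], [{3, 4, 5}], [], []])
s3 = canon([[{1, 2, 3, 4, 5}], [], [], [], []])
path = [s0, s1, s2, s3]
print(all(only_merges(a, b) for a, b in zip(path, path[1:])))
```

Along this path blocks only merge, as Definition 2 requires, whereas the reverse step from the second state back to the first would split the block {3,4} and is rejected by `only_merges`.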

Remark 1.1. Movements and mergers of blocks do not alter the sample size. However, this does not
guarantee that the structured genealogies are consistent in the sense given in Section 1.2, as we would
expect from a reasonable genealogical process. In fact, several conditions will be imposed on the models we
consider so that this property holds: see Lemma 2.2 for the consistency of the fast genealogical process,
and the set of conditions imposed on the geographical gatherings in Proposition 2.1. Proposition
4.1 states in particular that the latter conditions are necessary and sufficient for the genealogies to be
consistent on both time scales and that, when they are fulfilled, the unstructured genealogical process on
the slow time scale is a $\Xi$-coalescent.

Let us order the components of a given structured partition by the smallest element belonging to a
block contained in the component (if it is non-empty). Empty components come last. For each $k \le n$ and
$\zeta \in \mathrm{P}_n^s$, let us write $|\zeta|_a = k$ if the $a$'th component (in the order just defined) of the structured partition
$\zeta$ contains $k$ blocks, and define a subset $\Pi_n$ of $\mathrm{P}_n^s$ by

$$\Pi_n = \{\zeta \in \mathrm{P}_n^s : |\zeta|_a \le 1 \ \ \forall a \in \{1,\dots,n\}\}. \tag{1}$$

$\Pi_n$ is the set of all structured partitions of $[n]$ in which each deme contains at most one lineage. These
sets will appear naturally in the description of the limiting processes.
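As a small illustration of the ordering of components and of the set of structured partitions with at most one lineage per deme, here is a sketch under the assumption (ours, for illustration only) that a structured partition is given as a list of demes, each deme being a list of blocks; the function names are hypothetical.

```python
def block_counts(zeta):
    """Numbers of blocks per component, with components ordered by the
    smallest element they contain and empty components placed last."""
    nonempty = [deme for deme in zeta if deme]
    nonempty.sort(key=lambda deme: min(min(block) for block in deme))
    return [len(deme) for deme in nonempty] + [0] * (len(zeta) - len(nonempty))


def in_Pi_n(zeta):
    """True if every deme contains at most one block (one lineage)."""
    return all(count <= 1 for count in block_counts(zeta))


zeta = [[{5}], [], [{3, 4}], [{1}, {2}], []]
print(block_counts(zeta), in_Pi_n(zeta))
```

Here the component holding {1} and {2} comes first and contains two blocks, so this configuration lies outside the set defined in (1).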

Recall from the example given in Section 1.3 that the rate at which lineages are collected together
in the same deme is much smaller than the rate at which lineages already occupying the same deme
either coalesce or are scattered into different demes. Furthermore, as in that example, we will continue
to assume that catastrophic extinction-recolonization events occur rarely, in fact, at rates that are of the
same order of magnitude as the rate at which lineages occupying different demes are brought together
by ordinary migration. With these points in mind, let us consider a sequence $(\mathcal{P}^D_s, s \ge 0)$ of structured
genealogical processes for a finite sample from the whole population, which consist of the following kinds
of events:

• within-deme coalescence and movement of lineages to empty demes at rates of order $O(1)$;

• movement of groups of lineages initially occupying different demes into the same deme, possibly
followed by mergers of some of these lineages, at rates of order $O(r_D^{-1})$.

Let us rescale time by a factor of $r_D$ so that the coalescence rate of two individuals in different demes
is of order $O(1)$ as $D$ tends to infinity. Of course, within-deme coalescence and migration now occur at
increasing rates of order $O(r_D)$. This implies that, for a given sample size $n$, the generator $G^D$ of the
genealogical process acting on $\mathrm{P}_n^s$ has the form

$$G^D = r_D \Psi + \Gamma + R_D,$$

where $\Psi$, $\Gamma$ and $R_D$ are bounded linear operators, $\langle R_D \rangle \to 0$ as $D \to \infty$, and we do not record the
dependence of the operators on the sample size $n$. Here, if $\|\cdot\|$ stands for the supremum norm on the
space of functions $f : \mathrm{P}_n^s \to \mathbb{R}$, then $\langle R \rangle$ is defined by

$$\langle R \rangle = \sup_{f \ne 0} \frac{\|Rf\|}{\|f\|}. \tag{2}$$

Because $r_D \to \infty$, the sequence $(G^D)_{D \ge 1}$ is unbounded, even when applied to functions of the
unstructured partition induced by $\mathcal{P}^D$, and so we do not expect the structured coalescent processes
corresponding to these generators to converge pathwise. Nevertheless, our heuristic description of the
fast dynamics suggests that elements of $\Pi_n$ will be unaffected by the 'fast' events corresponding to $\Psi$,
which will indeed be the case under the assumptions made in Section 3. Furthermore, we will show (cf.
Lemma 2.1) that the process generated by $\Psi$ on $\mathrm{P}_n^s$ and starting at $\zeta \in \mathrm{P}_n^s$ a.s. reaches a random final
state $\underline{\zeta}$ in $\Pi_n$ in a finite number of steps. Since the rates of the events generated by $\Psi$ grow to infinity,
increasing numbers of these events take place before the first event corresponding to $\Gamma$ even occurs. This
motivates the description of the genealogy given above in terms of an alternation of very short scattering
phases driven by $\Psi$ and of longer collecting phases ending with the first event generated by $\Gamma$ at which
$\mathcal{P}^D$ leaves $\Pi_n$. Viewing all of the transitions occurring during a given scattering phase as a single, more
complex event, and exploiting the fact that these phases are vanishingly short, it is plausible that there
is a genealogical process $\mathcal{P}$ with values in $\Pi_n$ such that for each fixed time $t > 0$, $\mathcal{P}^D_t \Rightarrow \mathcal{P}_t$ as $D \to \infty$.
Our main result makes these heuristic arguments rigorous:

Theorem 1.1. Let $\zeta \in \mathrm{P}_n^s$. Under the conditions described in Section 3.1, the finite-dimensional
distributions of the structured genealogical process $\mathcal{P}^D$ starting at $\zeta$ converge to those of a $\Pi_n$-valued Markov
process $\mathcal{P}$ starting at $\underline{\zeta}$, except at time 0.
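The separation of time scales behind this statement can be illustrated on a toy chain for two lineages. This is only a sketch under assumptions that are ours and not the paper's: lineages in different demes are gathered at a slow rate `gamma`, while two lineages sharing a deme coalesce at rate `r * c` or are scattered at rate `r * m`, with `r` playing the role of $r_D$. The mean coalescence time then solves a two-equation linear system, and as `r` grows it approaches that of a single exponential clock with rate `gamma * c / (c + m)`.

```python
def mean_time_to_coalescence(r, c=1.0, m=1.0, gamma=1.0):
    """Mean coalescence time of two lineages starting in different demes.

    Toy chain (an assumption for illustration):
      different demes -> same deme   at rate gamma   (slow, collecting)
      same deme -> coalesced         at rate r * c   (fast)
      same deme -> different demes   at rate r * m   (fast, scattering)
    """
    hold_same = 1.0 / (r * (c + m))  # mean holding time in the 'same deme' state
    p_scatter = m / (c + m)          # probability of scattering before coalescing
    # Solve T_diff = 1/gamma + T_same and T_same = hold_same + p_scatter * T_diff.
    return (1.0 / gamma + hold_same) / (1.0 - p_scatter)


limit = (1.0 + 1.0) / (1.0 * 1.0)  # (c + m) / (c * gamma) with the default values
for r in (10.0, 1e3, 1e6):
    print(r, mean_time_to_coalescence(r), limit)
```

In this toy setting the scattering phase only rescales the effective rate of the slow process, which mirrors the role played by the fast dynamics in the heuristic description above.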

The proof that $\mathcal{P}^D$ converges in law to $\mathcal{P}$ in the Skorokhod space $D_{\mathrm{P}_n^s}[0,\infty)$ of all càdlàg paths
with values in $\mathrm{P}_n^s$ requires tightness of the corresponding sequence of distributions. We shall show in
Proposition 3.2 that this property holds if and only if the rate at which the genealogical process leaves
the set $\Pi_n$ tends to zero as $D$ grows to infinity. Indeed, if this condition is not satisfied, then two or more
jumps can accumulate during a scattering phase: the jump out of $\Pi_n$ followed by the events needed to
bring $\mathcal{P}^D$ back into $\Pi_n$. Fortunately, the proof that the unstructured genealogical processes are tight is
less demanding, since these processes do not change state when lineages move between demes. In this
case, an accumulation of jumps due to the fast within-deme dynamics will be ruled out if we can show
that the probability that the process $\mathcal{P}^D$ re-enters $\Pi_n$ in a single jump converges to one as $D$ tends to
infinity.

The limiting process $\mathcal{P}$ with values in $\Pi_n$ is introduced and investigated in Section 2.3, and we show
in Proposition 2.1 that, under the assumptions of Theorem 1.1, the unstructured genealogical process
induced by $\mathcal{P}$ is the restriction to $\mathrm{P}_n$ of a $\Xi$-coalescent. We also identify the limiting process $\xi$ for the
genealogy on the fast time scale in Section 2.2, and state in Proposition 3.1 the convergence of $\mathcal{P}^D_{r_D^{-1}\cdot}$ to $\xi$ as
processes with values in $D_{\mathrm{P}_n^s}[0,\infty)$. The proofs of Theorem 1.1 and Proposition 3.1 are given in Section
3, along with a discussion of the tightness of $\mathcal{P}^D$. Although the conditions of Theorem 1.1 are somewhat
contrived, we show in Section 4 that these are necessary and sufficient for the unstructured genealogical
process of a generalized island model to converge to a $\Xi$-coalescent on the slow time scale. In Section
5, we apply these results to a particular class of models incorporating mass extinction events. Based
on our analysis of this class, we suggest that families of $\Xi$-coalescents may often interpolate between
$\Lambda$-coalescents and Kingman's coalescent in structured population models, and that it may be a generic
property of such models that they admit simultaneous mergers whenever they admit multiple mergers.



2 Construction of the limiting genealogical processes 
2.1 A generalized Island-Cannings Model 

To motivate the genealogical processes considered in this paper, we begin by introducing a general model
for the demography of a subdivided population which combines features of the Cannings model (Cannings,
1974) with those of the classical Island model (Wright, 1931).



Suppose that the population is subdivided into $D$ demes, each of which contains $N$ haploid individuals.
Islands are labeled $1,\dots,D$, while individuals within each island are labeled $1,\dots,N$. At rate 1, an
$ND^2$-dimensional random vector $R = (R_k^{i,j},\ i,j \in \{1,\dots,D\},\ k \in \{1,\dots,N\})$ is chosen, such that for all $i,j,k$,
$R_k^{i,j}$ is the number of descendants of the $k$'th individual in deme $j$ which settle into deme $i$ during the
event. In keeping with the spirit of the Cannings model, we use the term 'descendant' to refer both to
the offspring of reproducing individuals and to individuals which were alive both before and after
the event (as in Cannings' formulation of the Moran model). We impose the following conditions on the
random variables $R_k^{i,j}$:
1. Constant deme size: With probability 1, for all $i \in [D]$ we have $\sum_{j,k} R_k^{i,j} = N$.

2. Exchangeable dynamics: The law of $R$ is invariant under any permutation $\sigma$ of $[D]^2 \times [N]$ such
that for every $i \in [D]$, $\sigma(i,i,k)_1 = \sigma(i,i,k)_2$, i.e., $\sigma$ conserves the relation source deme = destination
deme. (Here, $\sigma(i,j,k)_l$ denotes the $l$'th component of the permuted vector.)

Then, in each deme the current population is replaced by the N offspring coming into this deme during 
the event, which we label in an exchangeable manner. 

Let us comment on the above conditions. The first one simply guarantees that the number of individuals
in each deme is constant and equal to $N$. For the second condition, let us first fix $i, j$ and a permutation
$\tau$ of $[N]$, and look at the permutation $\sigma$ given by $\sigma(i,j,k) = (i,j,\tau(k))$ and $\sigma(i',j',k') = (i',j',k')$
whenever $i' \ne i$ or $j' \ne j$. Then, condition 2 corresponds to the exchangeability of the contribution of
the inhabitants of deme $j$ in repopulating deme $i$. Second, fix $i$ and choose a permutation $\tau$ of $[D] \setminus \{i\}$.
Set $\sigma(i,j,k) = (i,\tau(j),k)$ if $j \ne i$, $\sigma(i,i,k) = (i,i,k)$ and $\sigma(i',j',k') = (i',j',k')$ whenever $i' \ne i$. In this
case, condition 2 states that the demes different from deme $i$ contribute in an exchangeable manner to
the repopulation of deme $i$. Finally, let $\tau$ be a permutation of $[D]$ and define $\sigma(i,j,k) = (\tau(i),j,k)$ if
$j \notin \{i,\tau(i)\}$, $\sigma(i,j,k) = (\tau(i),\tau(i),k)$ if $j = i$, and $\sigma(i,j,k) = (\tau(i),i,k)$ if $j = \tau(i)$. For such permutations,
condition 2 asserts that the dispersal mechanism is exchangeable with respect to the destination
of dispersing individuals (provided that this differs from the source deme). Overall, our assumptions
aim at making the dynamics depend on the labels as weakly as possible, but we allow the repopulation
mechanism of a deme to differ according to whether the new individuals are produced within this deme
or come from one of the $D - 1$ other demes.

Example 1. If $R$ is invariant under all permutations $\sigma$ of $[D]^2 \times [N]$ (not just those satisfying condition
2), then the dynamics are those of a Cannings model for a panmictic population of size $DN$, i.e., there
is no population subdivision.

Example 2. If all demes evolve independently of each other, then $R^{i,j} = (0,\dots,0)$ whenever $j \ne i$. Condition 2
imposes that $(R^{i,i}, i \in [D])$ should be an exchangeable $D$-tuple of exchangeable $N$-tuples, a situation
corresponding to a continuous-time Cannings model acting within each deme.

Example 3. Let $m \in [0,1]$ and assume that, with probability $1 - m$, $R$ is chosen as in Example 2.
With probability $m$, four numbers $i,j,l,k$ are sampled uniformly at random in $[D]^2 \times [N]^2$, and the $k$'th
individual in deme $j$ produces an offspring that replaces the $l$'th individual in deme $i$. In this case,
$R^{i,i} = (1,\dots,0,\dots,1)$, where the unique zero is in the $l$'th coordinate; $R^{i,j} = (0,\dots,1,\dots,0)$, where the
unique 1 is in the $k$'th coordinate; $R^{i,j'} = (0,\dots,0)$ if $j' \notin \{i,j\}$; and for $i' \ne i$, $R^{i',j'} = (1,\dots,1)$ if
$i' = j'$ and $(0,\dots,0)$ otherwise. This model gives a simple example including within-deme reproduction
and individual migration. Alternatively, individuals could be exchanged between demes during a migration
event, in which case a descendant of individual $l$ in deme $i$ (in the above notation) also replaces individual
$k$ in deme $j$.
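To make the bookkeeping of Example 3 concrete, the sketch below builds the array of descendant counts for a single migration event and checks the constant deme size condition. The indexing convention `R[i][j][k]` (destination deme, source deme, individual), the zero-based labels and the function names are assumptions made for this illustration.

```python
def migration_event(D, N, i, j, l, k):
    """Descendant counts when individual k of deme j replaces individual l of deme i.

    R[a][b][x] is the number of descendants of individual x of deme b
    that settle in deme a during the event (labels start at 0 here).
    """
    R = [[[0] * N for _ in range(D)] for _ in range(D)]
    # Everybody who is not replaced survives in place (one 'descendant' each).
    for a in range(D):
        for x in range(N):
            R[a][a][x] = 1
    R[i][i][l] -= 1  # individual l of deme i is replaced ...
    R[i][j][k] += 1  # ... by an offspring of individual k of deme j
    return R


def deme_sizes(R):
    """Total number of individuals settling in each deme."""
    return [sum(sum(row) for row in R_i) for R_i in R]


R = migration_event(D=3, N=4, i=0, j=2, l=1, k=3)
print(R[0][0], R[0][2], deme_sizes(R))
```

Every deme still receives exactly `N` individuals, which is condition 1; the sketch does not attempt to verify the distributional condition 2.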

Example 4. An event during which one deme goes extinct and is recolonized by the offspring of individuals
coming from other demes has the following formulation: $R^{i,i} = (0,\dots,0)$ if deme $i$ goes extinct,
$R^{l,l} = (1,\dots,1)$ if $l \ne i$, and the repopulation of deme $i$ satisfies the exchangeability condition 2. For
instance, $N$ individuals are chosen uniformly at random among the $N(D - 1)$ inhabitants of the other
demes and contribute one offspring in the new population of deme $i$.

Many other kinds of events can be imagined, but these three mechanisms (reproduction, migration 
and extinction/recolonization) will be the building blocks of the models we shall consider in this paper. 
Viewed backwards in time, reproduction events as in Example 2 will correspond to the merger of several
lineages if they are produced (forwards in time) by the same individual during the event considered. A
migration event such as in Example 3 will correspond to the movement of one or a few lineages from their
demes to other subpopulations, if these lineages happen to have their parents in the source demes. An
extinction event will also typically result in the movement of lineages among demes, and could involve 
much larger numbers of individuals or demes than simple migration events. Note that lineages can both 
move and merge during the same event, if their common parent lies in a different deme. 

2.2 Genealogy on the fast time scale

Let us start by constructing a structured genealogical process $(\xi_t, t \ge 0)$ such that its restriction to $\mathrm{P}_n^s$
describes the genealogy of $n$ individuals on the fast time scale of individual demes. This process will
incorporate mergers of lineages occupying the same deme as well as dispersal of lineages into empty
demes (i.e., those not containing other ancestral lineages), but no events where geographically separated
lineages end up in identical demes and possibly merge. In fact, if the rate at which such events occur is
very large, then it is not difficult to see that the structure of the population effectively disappears on the
fast time scale and the model collapses to that of a panmictic population. We thus rule out this kind of
situation in order to keep a structured population.

We construct the process $\xi$ by specifying its restriction to $\mathrm{P}_n^s$. As $\mathrm{P}_n^s$ is a finite set, we can define
a continuous-time Markov process on this space by specifying its transition rates. Because a block
represents a single ancestor, whose descendants at time 0 are the individuals contained in the
block, we shall ask that the rates at which blocks move and merge do not depend on the number or
labels of these individuals. Hence, these rates will only depend on the collection $\{k_1,\dots,k_p,0,\dots,0\}$
giving the numbers of blocks contained in the different components of $\zeta$. In order to describe the possible
transitions, we need the following definition.

Definition 3. Let $k = \{k_1,\dots,k_p\}$ and $k' = \{k'_1,\dots,k'_q\}$ be two collections of (non-zero) integers. We
shall write $k \rhd k'$ if $q \ge p$, $\sum_{i=1}^{q} k'_i \le \sum_{j=1}^{p} k_j$, and we can arrange the elements of $k'$ so that for each
$i \in \{1,\dots,p\}$, we have $1 \le k'_i \le k_i$ and at least one of these $k'_i$ is strictly less than $k_i$.

Note that no collection $k$ of integers satisfies $\{1,\dots,1\} \rhd k$.
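The relation of Definition 3 can be checked by brute force for small collections. The sketch below reads the definition as requiring at least as many entries in the second collection as in the first, a total that does not increase, and an arrangement of the second collection in which the first entries are dominated by those of the first collection with at least one strict inequality. This reading of the partly illegible source, and the name `can_follow`, are assumptions.

```python
from itertools import permutations


def can_follow(k, k_prime):
    """Brute-force test of the relation between block-count collections k and k'."""
    k, k_prime = list(k), list(k_prime)
    p, q = len(k), len(k_prime)
    if q < p or sum(k_prime) > sum(k):
        return False
    # Try every way of matching p entries of k' with the entries of k.
    for arrangement in set(permutations(k_prime, p)):
        fits = all(1 <= kp <= ki for kp, ki in zip(arrangement, k))
        strict = any(kp < ki for kp, ki in zip(arrangement, k))
        if fits and strict:
            return True
    return False


print(can_follow([2], [1, 1]))     # scattering of two blocks sharing a deme
print(can_follow([3], [2]))        # merger of two of three blocks in a deme
print(can_follow([1, 1], [1, 1]))  # nothing can happen when all demes hold one block
```

The last call reflects the note above: a configuration with a single block in every occupied deme cannot be the starting point of a transition. The enumeration over permutations is only practical for small collections.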

For all pairs $(k, k')$ such that $k \rhd k'$, let $\vartheta_{k,k'} \in \mathbb{R}_+$. In addition, if $\zeta \in \mathrm{P}_n^s$, let $k(\zeta)$ be the collection of
integers which gives the number of blocks within each non-empty component of $\zeta$. Define the infinitesimal
rate $q_\xi(\eta|\zeta)$ of a particular transition $\zeta \to \eta$ (when $\eta \ne \zeta$ and both belong to $\mathrm{P}_n^s$) as:

• $q_\xi(\eta|\zeta) = \vartheta_{k(\zeta),k(\eta)}$, if $\eta$ can be obtained from $\zeta$ by first merging some number (possibly zero)
of blocks contained in the same component of $\zeta$, and then moving some blocks to formerly empty
demes with the restriction that only blocks originating from the same deme can be gathered into
the same destination deme (again, we allow the number of blocks moved to be zero). In this case,
we easily see that we must have $k(\zeta) \rhd k(\eta)$.

• $q_\xi(\eta|\zeta) = 0$ otherwise.

In the following, we shall assume that for any $\zeta \in \mathrm{P}_n^s$ containing more than one block in at least one
component, the rates satisfy the condition

$$\sum_{\eta \ne \zeta} q_\xi(\eta|\zeta) > 0.$$

These conditions ensure that, whenever a deme contains more than one lineage, a scattering or a coalescence
event will happen in the future with probability one. Recall the definition of $\Pi_n$ given in (1). From
the form of the rates given above, we see that any $\eta \in \Pi_n$ is an absorbing state for $\xi$. Moreover, we have
the following result, saying in essence that the process $\xi$ with values in $\mathrm{P}_n^s$ reaches a final state in a finite
number of steps, and this final state is a random variable with values in $\Pi_n$.

Lemma 2.1. Let $\tau_\pi$ be the stopping time defined by $\tau_\pi = \inf\{t \ge 0 : \xi_t \in \Pi_n\}$. Then, $\tau_\pi$ is a.s. finite
and for all $t \ge \tau_\pi$, $\xi_t = \xi_{\tau_\pi}$.

Proof. From our assumptions on the rates, the only absorbing states of the process $\xi$ are the structured
partitions contained in $\Pi_n$. Moreover, any transition results in a coarsening of the corresponding
unstructured partition or in the movement of some lineages to different empty demes, so the number of
transitions for $\xi$, starting at any $\zeta_0 \in \mathrm{P}_n^s$, is bounded by $n$. Since the time between two events is
exponentially distributed with a non-zero parameter (the sum of the rates of the possible transitions) as long
as the process has not reached an absorbing state, the finiteness of the number of transitions undergone
by $\xi$ implies that $\tau_\pi$ is a.s. finite. □
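The absorption argument can be watched on a simulation. The sketch below runs a simplified fast process that only tracks the number of blocks in each occupied deme. The specific rates are assumptions made for illustration: each pair of blocks sharing a deme merges at rate `c`, and each block in a deme holding several blocks moves alone to an empty deme at rate `m`; moves of groups of blocks, which the construction above allows, are left out.

```python
import random


def run_fast_process(counts, c=1.0, m=1.0, rng=None):
    """Simulate block counts per deme until every deme holds at most one block.

    Returns the final counts, the number of transitions and the elapsed time.
    """
    rng = rng or random.Random(0)
    counts = [k for k in counts if k > 0]
    t, steps = 0.0, 0
    while any(k > 1 for k in counts):
        events = []
        for a, k in enumerate(counts):
            if k > 1:
                events.append((c * k * (k - 1) / 2, "merge", a))
                events.append((m * k, "move", a))
        total = sum(rate for rate, _, _ in events)
        t += rng.expovariate(total)  # exponential waiting time
        u = rng.random() * total     # pick an event proportionally to its rate
        for rate, kind, a in events:
            u -= rate
            if u <= 0:
                break
        counts[a] -= 1               # both event types remove one block from deme a
        if kind == "move":
            counts.append(1)         # the moved block occupies a formerly empty deme
        steps += 1
    return counts, steps, t


final, steps, elapsed = run_fast_process([5])
print(final, steps, elapsed)
```

In this simplified version each transition lowers the number of blocks minus the number of occupied demes by exactly one, so a run started from five blocks in one deme always stops after four transitions, in line with the bound used in the proof.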
Let us introduce the following notation, justified by the result of Lemma 2.1.

Notation 1. If $\zeta \in \mathrm{P}_n^s$, let $\underline{\zeta}$ denote a random variable with values in $\Pi_n$, whose distribution is that of
the final state of the structured genealogical process $\xi$ started at $\zeta$.

We end this subsection with the following lemma, whose main purpose is to introduce the notion of
consistency for structured genealogical processes. If $\zeta \in \mathrm{P}_k^s$ and $\zeta' \in \mathrm{P}_{k+1}^s$ for some $k \in \mathbb{N}$, let us write
$\zeta \prec \zeta'$ if the projection of $\zeta'$ onto $\mathrm{P}_k^s$ (the $k$-tuple describing the structured partition of $1,\dots,k$) equals $\zeta$.

Lemma 2.2. Suppose that $\xi$ is defined on $\mathrm{P}_k^s$ for every $k \in \mathbb{N}$. The following conditions are equivalent:

(i) For each $k \ge 1$, $\zeta, \eta \in \mathrm{P}_k^s$ and $\zeta' \in \mathrm{P}_{k+1}^s$ such that $\zeta \prec \zeta'$,

$$q_\xi(\eta|\zeta) = \sum_{\eta'} q_\xi(\eta'|\zeta'),$$

where the sum is over all $\eta' \in \mathrm{P}_{k+1}^s$ such that $\eta \prec \eta'$.

(ii) The process $\xi$ is consistent in the sense that, for all $k \ge 1$, if $\zeta \in \mathrm{P}_k^s$ and $\zeta' \in \mathrm{P}_{k+1}^s$ satisfy $\zeta \prec \zeta'$,
then the law of the restriction to $\mathrm{P}_k^s$ of the process $\xi$ started at $\zeta'$ is the same as the law of $\xi$ started at $\zeta$.

In particular, if both conditions are fulfilled and if $\eta \in \Pi_k$ has $r$ blocks, then

$$\mathbb{P}[\underline{\zeta} = \eta] = \sum_{j=1}^{r+1} \mathbb{P}[\underline{\zeta'} = \eta^{(j)}],$$

where for each $j \in \{1,\dots,r\}$, $\eta^{(j)} \in \Pi_{k+1}$ is obtained from $\eta$ by adding an empty $(k+1)$-st component to
$\eta$ (which turns it into an element $\eta'$ of $\Pi_{k+1}$), and adding $k+1$ in the $j$'th block of $\eta'$. Likewise, $\eta^{(r+1)}$
is obtained by adding the singleton $\{k+1\}$ in an empty component of $\eta'$.

Proof. Let $\xi_{(k)}$ (resp. $\xi_{(k+1)}$) denote the process $\xi$ started at $\zeta \in \mathrm{P}_k^s$ (resp. $\zeta' \in \mathrm{P}_{k+1}^s$), and call $\xi'_{(k)}$
the projection onto $\mathrm{P}_k^s$ of $\xi_{(k+1)}$. Since we work with finite state spaces and discrete jump processes,
(ii) is equivalent to the fact that for all $\gamma, \eta \in \mathrm{P}_k^s$ the rate at which $\xi'_{(k)}$ jumps from $\gamma$ to $\eta$ is equal to
the corresponding transition rate for $\xi_{(k)}$. By construction, the former is the sum of the rates of all the
transitions from the current state of $\xi_{(k+1)}$ to a state $\eta'$ such that $\eta \prec \eta'$, and so (ii) holds if and only if
(i) does.

The second part of Lemma 2.2 is a direct consequence of the consistency of the process. □

2.3 Limiting process on the slow time scale 

Let us now describe the form that we would expect the genealogical process to take on the slow time
scale as the number of demes tends to infinity. This process $\mathcal{P}$ will have values in $\Pi_n$, so once again we
construct it by specifying its transition rates.

Recall the two ingredients of the description of the structured genealogical processes indexed by
$D < \infty$, given in Section 1.4. Coalescence and movement of blocks to formerly empty demes are the
two kinds of events that constitute the fast process $\xi$, and we saw in Lemma 2.1 that the final state
reached by $\xi$ belongs to $\Pi_n$ a.s. Therefore, we now need to describe how the resulting geographically
separated lineages are gathered into identical demes and, potentially, merge during the same event. As
in the definition of $\xi$, the rate at which such an event occurs will only depend on the number $r$ of demes
containing at least one lineage just after the event, on the numbers $k_1,\dots,k_r$ of blocks brought together
into these components, and on the number and sizes of the groups of blocks ending up in the same demes
which subsequently merge into a bigger block. Hence, we shall use the following terminology.

Definition 4. Let $k \ge 2$, and $k_1,\dots,k_r \ge 1$ such that $\sum_{i=1}^{r} k_i = k$ and at least one of the $k_i$'s is
greater than 1. Let also $L_1 = \{l_{1,1},\dots,l_{1,i_1}\}, \dots, L_r = \{l_{r,1},\dots,l_{r,i_r}\}$ be $r$ (unordered) sets of integers
such that for $j \in \{1,\dots,r\}$, we have $\sum_{u=1}^{i_j} l_{j,u} = k_j$. We call an event in which $k$ lineages spread in $k$
different demes become grouped into $k_1$ lineages in one deme, $k_2$ lineages in another deme, ..., and for
all $j \in \{1,\dots,r\}$, $l_{j,1}$ lineages in deme $j$ merge into one, $l_{j,2}$ into another, and so on (all mergers occur
between lineages which landed in the same deme) a $(k; k_1,\dots,k_r; L_1,\dots,L_r)$-geographical collision.
Remark 2.1. A geographical collision is to be understood as a particular transition. Because the order of
$k_1,\dots,k_r$ does not matter, a $(k; k_1,\dots,k_r; L_1,\dots,L_r)$-geographical collision is also a
$(k; k_{\sigma(1)},\dots,k_{\sigma(r)}; L_{\sigma(1)},\dots,L_{\sigma(r)})$-geographical collision for any permutation $\sigma$ of $\{1,\dots,r\}$. Thus, for a given
$(k; k_1,\dots,k_r; L_1,\dots,L_r)$, the number of $(k; k_1,\dots,k_r; L_1,\dots,L_r)$-geographical collisions is

$$\mathcal{A}(k; k_1,\dots,k_r) \prod_{m=1}^{r} \mathcal{A}(k_m; l_{m,1},\dots,l_{m,i_m}),$$

where if $k, k_1,\dots,k_r$ are such that $\sum_{i=1}^{r} k_i = k$ and $b_j$ is the number of $k_i$'s equal to $j$, then

$$\mathcal{A}(k; k_1,\dots,k_r) = \binom{k}{k_1,\dots,k_r} \frac{1}{\prod_{j=1}^{k} b_j!}. \tag{3}$$

Indeed, the multinomial coefficient gives the number of ways of choosing $k_1$ blocks to form a family numbered 1,
$k_2$ other blocks to form family number 2, and so on. But any permutation of the labels of families having the
same size gives the same unordered structured partition, hence the normalization by the fraction in the
right-hand side of (3).
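The counting formula of Remark 2.1 is easy to evaluate numerically. The following sketch computes the multinomial coefficient divided by the factorials of the multiplicities of equal group sizes, and then the product over demes; the function names are hypothetical and the inputs are plain lists of group sizes.

```python
from collections import Counter
from math import factorial


def count_groupings(sizes):
    """Number of ways to split sum(sizes) labelled items into unordered
    groups with the given sizes (multinomial over multiplicity factorials)."""
    ways = factorial(sum(sizes))
    for size in sizes:
        ways //= factorial(size)
    for multiplicity in Counter(sizes).values():
        ways //= factorial(multiplicity)
    return ways


def count_collisions(ks, Ls):
    """Product of the grouping counts over demes, as in Remark 2.1.

    ks[m] is the number of lineages gathered in deme m and Ls[m] lists the
    sizes of the groups that merge there (so sum(Ls[m]) == ks[m]).
    """
    total = count_groupings(ks)
    for k_m, L_m in zip(ks, Ls):
        if sum(L_m) != k_m:
            raise ValueError("merger sizes must add up to the deme size")
        total *= count_groupings(L_m)
    return total


print(count_groupings([2, 2]))                 # two unordered pairs out of four lineages
print(count_collisions([3, 1], [[2, 1], [1]]))
```

For four lineages split into two pairs, the multinomial coefficient 6 is halved because the two pairs are interchangeable; for three lineages gathered in one deme with two of them merging, there are 4 choices of the lineage left alone in its deme times 3 choices of the lineage that does not merge.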

Let us now define the structured genealogical process $\mathcal{P}$. The relation between the coefficients of $\mathcal{P}$
and the sequence of structured genealogical processes will be given in the next section, and we simply
give the form of the limiting process here. Recall that $|\zeta|_a = k$ if the $a$'th component of $\zeta$ contains $k$
blocks, and write $|\zeta|$ for the total number of blocks of $\zeta \in \mathrm{P}_n^s$, that is $|\zeta| = \sum_{a=1}^{n} |\zeta|_a$. Furthermore, let
$\underline{\zeta}$ be a $\Pi_n$-valued random variable with the distribution specified in Notation 1.

Definition 5. For all integers and sets $k$, $k_i$ and $L_i$ satisfying the conditions of Definition 4, let
$\lambda^g_{k;k_1,\dots,k_r;L_1,\dots,L_r} \ge 0$. Then, $(\mathcal{P}_t, t \ge 0)$ is the Markov process with values in $\Pi_n$ which evolves
as follows: when $\mathcal{P}_t = \chi \in \Pi_n$, any $(|\chi|; k_1,\dots,k_r; L_1,\dots,L_r)$-geographical collision occurs at rate
$\lambda^g_{|\chi|;k_1,\dots,k_r;L_1,\dots,L_r}$. Given that this collision has outcome $\zeta \in \mathrm{P}_n^s$, the new value of $\mathcal{P}$ is drawn from the
distribution of $\underline{\zeta}$.

We can recover the expression for the rate of any given transition in the form

$$q(\eta|\chi) = \sum_{\zeta \in \mathrm{P}_n^s} \lambda^g_{|\chi|;k_1,\dots,k_r;L_1,\dots,L_r}\, \mathbb{P}[\underline{\zeta} = \eta],$$

where the rate $\lambda^g_{|\chi|;k_1,\dots,k_r;L_1,\dots,L_r}$ in the term of the sum labeled by a given $\zeta$ is the rate of occurrence of
the only possible geographical collision turning $\chi$ into $\zeta$, if such a collision exists. If it does not, we set
the rate to 0. Consequently, the previous description does specify a Markov process on $\Pi_n$.

Observe that this description allows 'ghost events' in which lineages are gathered in identical demes 
by a geographical collision and then scattered again in different demes without coalescing, so that the 
actual transition is of the form $\chi \to \chi$. However, we shall need to keep track of these ghost events in the
proof of convergence of the structured genealogical processes. Therefore, we shall always consider them 
as events which do occur at a certain rate but have no effect on V . 

To finish the description of our limiting process, we have the following result, which in fact describes 
the unstructured genealogy under some additional conditions. 

Proposition 2.1. For each $\zeta \in \Pi_n$, let us define $\zeta^u \in \mathrm{P}_n$ as the unstructured partition of $[n]$ induced by
$\zeta$. Then the unstructured genealogical process $(\mathcal{P}^u_t, t \ge 0)$ induced by $\mathcal{P}$ is a Markov process with values
in $\mathrm{P}_n$. Suppose in addition that condition (i) of Lemma 2.2 holds and that the $\lambda^g$'s satisfy the following
consistency equations: for all $k \in \mathbb{N}$ and compatible $k_1,\dots,k_r$, $L_1,\dots,L_r$,

$$\lambda^g_{k;k_1,\dots,k_r;L_1,\dots,L_r} = \sum_{u=1}^{r} \sum_{j=1}^{i_u+1} \lambda^g_{k+1;k_1,\dots,k_u+1,\dots,k_r;L_1,\dots,L_u^{(j)},\dots,L_r} + \lambda^g_{k+1;k_1,\dots,k_r,1;L_1,\dots,L_r,\{1\}}, \tag{4}$$

where for each $u \le r$,

$$L_u^{(j)} = \begin{cases} \{l_{u,1},\dots,l_{u,j}+1,\dots,l_{u,i_u}\} & \text{if } j \le i_u, \\ \{l_{u,1},\dots,l_{u,i_u},1\} & \text{if } j = i_u+1. \end{cases}$$

(In particular, if instantaneous coalescence after the gathering of lineages is forbidden, then the $\lambda^g$'s are
associated to a $\Xi$-coalescent.)

Then $(\mathcal{P}^u_t, t \ge 0)$ is the restriction to $\mathrm{P}_n$ of a $\Xi$-coalescent on the partitions of $\mathbb{N}$.

Remark 2.2. By fixing $k, k_1,\dots,k_r$ and summing over all compatible integer sets $L_1,\dots,L_r$, we see
that the rates $\lambda^g_{k;k_1,\dots,k_r}$ at which $k$ lineages lying in $k$ different demes end up in a configuration where
$k_1$ lineages are in the same deme, $k_2$ in another one, and so on (regardless of how many of them merge
instantaneously thereafter), are associated to a $\Xi$-coalescent whenever condition (4) holds.

Proof. Any component of $\gamma \in \Pi_n$ contains at most one block and all $n$-tuples are defined up to a
permutation of their components, so the map $\Pi_n \to \mathrm{P}_n : \gamma \mapsto \gamma^u$ is a measurable bijection between $\Pi_n$
and $\mathrm{P}_n$. Thus, $\mathcal{P}^u$ inherits the Markov property of $\mathcal{P}$, and its transition rates $q^u(\eta^u|\gamma^u)$ are obviously
given by $q^u(\eta^u|\gamma^u) = q(\eta|\gamma)$.

Let us turn to the second part of Proposition 2.1. By assumption, $\mathcal{P}^u$ only coarsens as time goes on
and it is easy to check that all transition rates $\rho(k; k_1,\dots,k_r)$ from a partition with $k$ blocks to a partition
obtained by merging $k_1$ of those blocks into one, $k_2$ into a second one, ... ($k_1,\dots,k_r \in \mathbb{N}$, $\sum_{i=1}^{r} k_i = k$),
are equal and depend only on $k, k_1,\dots,k_r$ (the order of $k_1,\dots,k_r$ does not matter). Therefore, we
need only check the consistency condition given in Schweinsberg (2000) to identify $\mathcal{P}^u$ as the restriction
to the partitions of $[n]$ of a $\Xi$-coalescent. As the rates do not depend on $n$, let us rather work in $\Pi_k$
with $\gamma = (\{\{1\}\},\dots,\{\{k\}\})$ and $\eta = (\{\{1,\dots,k_1\}\},\{\{k_1+1,\dots,k_1+k_2\}\},\dots,\{\{k_1+\dots+k_{r-1}+1,\dots,k\}\},\emptyset,\dots,\emptyset)$,
and check that

$$\rho(k; k_1,\dots,k_r) = \sum_{i=1}^{r} \rho(k+1; k_1,\dots,k_i+1,\dots,k_r) + \rho(k+1; k_1,\dots,k_r,1).$$

i=l 

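Since the $\lambda^g$'s satisfy (4), we have

$$\begin{aligned}
\rho(k; k_1,\dots,k_r) &= q(\eta|\gamma) = \sum_{\zeta \in \mathrm{P}_k^s} \lambda^g_{k;l_1,\dots,l_s;L_1,\dots,L_s}\, \mathbb{P}[\underline{\zeta} = \eta] \\
&= \sum_{\zeta \in \mathrm{P}_k^s} \sum_{v=1}^{s} \sum_{j=1}^{i_v+1} \lambda^g_{k+1;l_1,\dots,l_v+1,\dots,l_s;L_1,\dots,L_v^{(j)},\dots,L_s}\, \mathbb{P}[\underline{\zeta} = \eta] + \sum_{\zeta \in \mathrm{P}_k^s} \lambda^g_{k+1;l_1,\dots,l_s,1;L_1,\dots,L_s,\{1\}}\, \mathbb{P}[\underline{\zeta} = \eta].
\end{aligned} \tag{5}$$

Here $s$ denotes the number of non-empty components of the $\zeta$ indexing each term, and $l_1,\dots,l_s$ the numbers of blocks they contain.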

We wish to compare this rate to the rates corresponding to $k+1$ blocks. To this end, let us define
$\zeta_{(v,j)} \in \mathrm{P}_{k+1}^s$ for any $\zeta \in \mathrm{P}_k^s$ with $s$ non-empty components and $v \in \{1,\dots,s+1\}$, $j \in \{1,\dots,i_v+1\}$ ($i_v$
being the number of blocks in the $v$'th non-empty component of $\zeta$) by turning $\zeta$ into a $(k+1)$-tuple and
adding individual $k+1$ in the $j$'th block of the $v$'th component of the new vector ($v = s+1$ means that
we add the block $\{k+1\}$ in the extra component, and likewise $j = i_v+1$ means that we add the block
$\{k+1\}$ in the $v$'th component of the new vector). For example, with the previous notation $\gamma$,

$$\gamma_{(1,2)} = (\{\{1\},\{k+1\}\},\dots,\{\{k\}\},\emptyset) \quad \text{and} \quad \gamma_{(k+1,1)} = (\{\{1\}\},\dots,\{\{k\}\},\{\{k+1\}\}).$$

Define also $\gamma^{(j)} \in \Pi_{k+1}$, for all $\gamma \in \Pi_k$ with $r$ blocks and $j \in \{1,\dots,r+1\}$, by turning $\gamma$ into a $(k+1)$-tuple
and adding individual $k+1$ in the block of the $j$'th component of the new vector. Once again,
$j = r+1$ means that we add a block $\{k+1\}$ in the extra component. For instance,

$$\eta^{(1)} = (\{\{1,\dots,k_1,k+1\}\},\{\{k_1+1,\dots,k_1+k_2\}\},\dots,\{\{k_1+\dots+k_{r-1}+1,\dots,k\}\},\emptyset,\dots,\emptyset)$$

and

$$\eta^{(r+1)} = (\{\{1,\dots,k_1\}\},\{\{k_1+1,\dots,k_1+k_2\}\},\dots,\{\{k_1+\dots+k_{r-1}+1,\dots,k\}\},\{\{k+1\}\},\emptyset,\dots,\emptyset).$$



With this notation, we see that

$$\sum_{i=1}^{r} \rho(k+1; k_1,\dots,k_i+1,\dots,k_r) + \rho(k+1; k_1,\dots,k_r,1) = \sum_{i=1}^{r+1} q\big(\eta^{(i)} \,\big|\, \gamma_{(k+1,1)}\big). \tag{6}$$

For all $\zeta' \in \mathrm{P}_{k+1}^s$, there exists a unique triplet $(\zeta, v, j)$ where $\zeta \in \mathrm{P}_k^s$ has $s$ non-empty components,
$v \in \{1,\dots,s+1\}$ and $j \in \{1,\dots,i_v+1\}$ such that $\zeta' = \zeta_{(v,j)}$. Indeed, $\zeta$ is given by the partition of
$\{1,\dots,k\}$ induced by $\zeta'$, $v$ is the component containing $k+1$ and $j$ is the block of that component in
which $k+1$ lies. Therefore, the right-hand side of (6) is equal to

$$\sum_{i=1}^{r+1} \sum_{\zeta \in \mathrm{P}_k^s} \sum_{v=1}^{s} \sum_{j=1}^{i_v+1} \lambda^g_{k+1;l_1,\dots,l_v+1,\dots,l_s;L_1,\dots,L_v^{(j)},\dots,L_s}\, \mathbb{P}\big[\underline{\zeta_{(v,j)}} = \eta^{(i)}\big] + \sum_{i=1}^{r+1} \sum_{\zeta \in \mathrm{P}_k^s} \lambda^g_{k+1;l_1,\dots,l_s,1;L_1,\dots,L_s,\{1\}}\, \mathbb{P}\big[\underline{\zeta_{(s+1,1)}} = \eta^{(i)}\big], \tag{7}$$

where $s$ and the coefficients $\lambda^g_{k+1;l_1,\dots,l_v+1,\dots,l_s;L_1,\dots,L_v^{(j)},\dots,L_s}$ correspond to the particular $\zeta$ indexing the
term of the sum. Let us look at a particular $\zeta$ in the second sum. The block $\{k+1\}$ remains a singleton just
after the geographical collision, so it is not affected by a following genealogical event and
$\mathbb{P}[\underline{\zeta_{(s+1,1)}} = \eta^{(i)}] = 0$ for all $i \in \{1,\dots,r\}$. Lemma 2.2 hence implies that
$\mathbb{P}[\underline{\zeta_{(s+1,1)}} = \eta^{(r+1)}] = \mathbb{P}[\underline{\zeta} = \eta]$, and the second term of (7) is equal to

$$\sum_{\zeta \in \mathrm{P}_k^s} \lambda^g_{k+1;l_1,\dots,l_s,1;L_1,\dots,L_s,\{1\}}\, \mathbb{P}[\underline{\zeta} = \eta].$$

Let us now look at a particular $\zeta$ in the first sum. When $v \le s$, the corresponding geographical collision
brings $k+1$ in a block of the $v$'th component of $\zeta$. By the second part of Lemma 2.2, the probability
that the final state of all the blocks different from $k+1$ is given by $\eta$ is equal to the sum over all
corresponding final states of these blocks and $k+1$. But taking the sum over $i$ in $\sum_{i=1}^{r+1} \mathbb{P}[\underline{\zeta_{(v,j)}} = \eta^{(i)}]$ boils
down to considering all such final states, since $\mathbb{P}[\underline{\zeta_{(v,j)}} = \eta^{(i)}] = 0$ if the individuals in the $i$'th block of $\eta$ were
not in the $v$'th component of $\zeta$ before their rearrangement by the genealogical process (recall that, under
the action of $\xi$, lineages can merge only if they start in the same deme). Therefore, we obtain that, for
all $\zeta \in \mathrm{P}_k^s$ and compatible $v, j$,

$$\sum_{i=1}^{r+1} \mathbb{P}\big[\underline{\zeta_{(v,j)}} = \eta^{(i)}\big] = \mathbb{P}[\underline{\zeta} = \eta].$$

Coming back to expressions |(6|) and ([7]), we obtain 

r 

p (k + 1; fci, . . . , h + 1, . . . , kr) + p(fc + 1; fci, . . . , fc r , 1) 

i=l 

s »„+l 

E E E A fc+i ;il ... .,,.+!,. ..,i, ;il ,...,L«',....L. ^ M + E A *+i i j 1 ,...,i„i ! i 1) ... I i.,{i} 

= p(fc; fci , ... , fcr), 

where the last equality follows from (5). This completes the proof of Proposition 2.1. □

3 Convergence of the structured genealogical processes 

Now that we have constructed the potential limits for our sequence of structured genealogical processes on the fast and slow time scales, let us state precisely what conditions we impose and in which sense $\mathcal{P}^D_{r_D^{-1}\cdot}$ and $\mathcal{P}^D$ converge.

3.1 Description of the conditions 

Let n > 1 be the sample size and define two types of events: 

• Type 1: some lineages contained in the same demes merge and some move (potentially in groups) 
to empty islands. The number of lineages involved in either step can be zero (meaning that only 
coalescence or only scattering has occurred), and lineages starting from different demes are not 
gathered into the same deme by the event. 
• Type 2: $k$ lineages move, but at least one of them lands in a non-empty deme or at least two dispersing lineages not coming from the same deme are gathered. During that event, $k_1$ lineages end up in the same deme, $k_2$ lineages in another, and so on. This is immediately followed by the coalescence of some lineages lying in identical demes (the number of such mergers can be zero, meaning that the lineages have only moved).

By our assumptions on the genealogical processes, these two types describe all kinds of events which can happen to the structured genealogical process $\mathcal{P}^D$, for each $D$. For conciseness, we shall call an event of type $i$ an $i$-event. Assume now that, when $\mathcal{P}^D$ has value $\zeta \in \mathcal{P}^s_n$ and $\eta \in \mathcal{P}^s_n$ is a possible new value compatible with the type of the event (the $o$'s hold as $D$ goes to infinity):

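1. The rate of occurrence of a particular 1-event $\zeta \to \eta$ can be written

\[ r_D\, \psi^{(n)}(\zeta, \eta) + v^{(n)}(\zeta, \eta) + o(1) \quad \text{as } D \to \infty, \]
where for each $n$, $v^{(n)}(\cdot, \cdot)$ is a bounded function on $(\mathcal{P}^s_n)^2$ and $r_D \to \infty$ as $D \to \infty$.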

2. Consider a 2-event involving $k$ lineages, for which there exist $k_1, \dots, k_r \ge 1$ such that $\sum_{u=1}^r k_u = |\zeta|$ and there exist $r$ sets of integers $L_1 = \{l_{1,1}, \dots, l_{1,i_1}\}, \dots, L_r = \{l_{r,1}, \dots, l_{r,i_r}\}$ such that for all $j \in \{1, \dots, r\}$ we have $\sum_{u=1}^{i_j} l_{j,u} = k_j$, satisfying: in the new structured partition, $k_1$ lineages end up in one deme, $k_2$ in another deme, ..., and for all $j \in \{1, \dots, r\}$, $l_{j,1}$ lineages in deme $j$ merge into one, $l_{j,2}$ into another one, and so on (once again, all mergers occur between lineages lying in the same deme). Then the rate of occurrence of any such event is of the form

(n) 

where for each n and all k < n, 1% is a bounded function on xP^, and in particular if £ G II n , 

Here again, the order of k\, . ■ ■ , k r does not matter. 

3. The $\psi$'s correspond to a structured genealogical process $\xi$ as described in the last section, and the $\lambda^g$'s satisfy the consistency equations (4).

Morally, the coalescence of lineages occupying common demes and the scattering of such lineages into 
empty demes occur more and more rapidly as D tends to infinity, whereas events collecting lineages into 
common demes occur at bounded rates. Other events are less and less frequent, so that in the limit we 
obtain a separation of time scales between the instantaneous structured genealogical process and the slow 
collecting phase of the limiting unstructured genealogical process. Notice that 1-events do not affect a structured partition contained in $\Pi_n$.

Let $G^{n,D}$ denote the generator of the genealogical process of a sample of $n$ individuals when the number of demes is $D$. For each $D$, the domain $\mathcal{D}(G^{n,D})$ of $G^{n,D}$ contains the measurable symmetric functions of $n$ variables (by symmetric, we mean invariant under all permutations of the variables). From the last remark, we see that for all $f \in \mathcal{D}(G^{n,D})$, the parts of $G^{n,D} f$ corresponding to 1-events vanish on $\Pi_n$. Furthermore, we can define linear operators $\Psi^n$, $\Gamma^n$ and $R^n_D$ such that $G^{n,D}$ has the following form:

\[ G^{n,D} = r_D \Psi^n + \Gamma^n + R^n_D. \]

More precisely, for every function / as above and each ( e P* , we have 

* n /(0= E and r«/(C)= E {^ n) (C,ri) + l { k n \Cv)}(f(ri)~f(0), 

and by the nonnegativity of their coefficients, these two operators can each be viewed as generating a jump process independent of $D$. In particular, we can define the structured genealogical process $\xi$ on $\mathcal{P}^s_n$ as the process generated by $\Psi^n$. The remaining terms $o(1)$ in Assumptions 1 and 2 constitute the coefficients of the (not necessarily positive) operator $R^n_D$, and so if we again use the operator norm introduced in (2), the finiteness of the number of possible transitions guarantees that $\|R^n_D\| = o(1)$ as $D \to \infty$.
3.2 Convergence of the structured genealogical processes

The main result of this section is the convergence of the finite-dimensional distributions of the $\mathcal{P}^s_n$-valued structured genealogical processes $\mathcal{P}^D$ to the corresponding ones of $\mathcal{P}$, except at time $t = 0$. The difficulty stems from the fact that the sequence of generators $G^{n,D}$ is unbounded because of the fast genealogical events driven by $r_D \Psi^n$. The proof consists in essence in showing that the dynamics of the genealogical processes become very close to the description of the dynamics of $\mathcal{P}$, in that for $D$ large enough, once a $\Gamma^n$-event (i.e., a geographical collision) occurs, enough $\Psi^n$-events happen in a very short period of time to bring the structured partition back into $\Pi_n$. During that short period, the probability that a $\Gamma^n$- or an $R^n_D$-event occurs is vanishingly small, so that at the time when $\mathcal{P}^D$ re-enters $\Pi_n$, with high probability it has the distribution of the final state of $\xi$ started at the structured partition created by the geographical collision. Overall, $R^n_D$-events are more and more infrequent and do not occur in the limit.
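The separation of time scales described above can be checked numerically on a deliberately small toy chain. The sketch below is only an illustration under stated assumptions and is not the model of this paper: two lineages sit in different demes (a state in $\Pi_n$), collide at a hypothetical slow rate `lam`, and, once in the same deme, either coalesce at rate `r_d` or scatter apart at rate `r_d * m`. The names `lam`, `m`, `survival_prelimit` and `survival_limit` are all assumptions introduced for the example, and the remainder operator is taken to be zero.

```python
import math

def survival_prelimit(r_d, lam, m, t):
    """P[the two lineages are still distinct at time t], started in different demes.

    Toy chain (hypothetical, for illustration only):
      A = lineages in different demes (inside Pi_n),
      B = lineages in the same deme (outside Pi_n),
      absorbing state = coalesced.
    A -> B at rate lam (slow collision); B -> coalesced at rate r_d and
    B -> A at rate r_d * m (fast events). Assumes m > 0.
    """
    trace = -(r_d * (1.0 + m) + lam)   # trace of the 2x2 sub-generator on {A, B}
    det = r_d * lam                    # its determinant
    disc = math.sqrt(trace * trace - 4.0 * det)
    mu_fast = (trace - disc) / 2.0     # eigenvalue of order -r_d
    mu_slow = det / mu_fast            # product of eigenvalues is det; avoids cancellation
    num = mu_slow * math.exp(mu_fast * t) - mu_fast * math.exp(mu_slow * t)
    return num / (mu_slow - mu_fast)

def survival_limit(lam, m, t):
    """Limit chain: each collision is resolved instantly, coalescing w.p. 1/(1+m)."""
    return math.exp(-lam * t / (1.0 + m))

for r_d in (1e1, 1e3, 1e6):
    gap = abs(survival_prelimit(r_d, 1.0, 1.0, 1.0) - survival_limit(1.0, 1.0, 1.0))
    print(f"r_D = {r_d:g}: |prelimit - limit| = {gap:.2e}")
```

In this toy setting the gap between the prelimit and limiting survival probabilities shrinks roughly like $1/r_D$, which mirrors the heuristic above: the fast events resolve each collision before any further slow event can intervene.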

Before stating the results of this section, let us define the probability measures of interest. We take for granted the fact that the processes $\xi$ and $\mathcal{P}^D$ for each $D \in \mathbb{N}$ and all $n \in \mathbb{N}$ can be constructed on the same probability space $(\Omega, \mathbb{P}, \mathcal{F})$. For all $\zeta \in \mathcal{P}^s_n$ we thus denote the probability measure under which these processes start at $\zeta$ by $\mathbb{P}_\zeta$. Likewise, let $(\Omega', \mathbf{P}, \mathcal{F}')$ be the probability space on which the processes $\mathcal{P}$ and $\chi$ (see Definition 6) are defined for all $n \in \mathbb{N}$. $\mathbf{P}_\eta$ denotes the probability measure under which these processes start at $\eta \in \Pi_n$.

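With this notation, Theorem 1.1 can be restated as:

Theorem 1.1. Suppose that the conditions stated in Section 3.1 hold, and let $\zeta \in \mathcal{P}^s_n$. Then, the structured genealogical processes $\mathcal{P}^D$ started at $\zeta$ converge to the process $\mathcal{P}$ started at $\zeta$ as $D$ tends to infinity, in the sense that for all $0 < t_1 < \dots < t_p$,

\[ \mathcal{L}_{\mathbb{P}_\zeta}\big(\mathcal{P}^D_{t_1}, \dots, \mathcal{P}^D_{t_p}\big) \Rightarrow \mathcal{L}_{\mathbf{P}_\zeta}\big(\mathcal{P}_{t_1}, \dots, \mathcal{P}_{t_p}\big) \quad \text{as } D \to \infty, \]

where $\mathcal{L}_{\mathbb{P}_\zeta}(X)$ stands for the law of the random variable $X$ under $\mathbb{P}_\zeta$ and $\mathcal{L}_{\mathbf{P}_\zeta}(X)$ is defined similarly.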

We also have the following result. 

Proposition 3.1. Assume again that the conditions of Section 3.1 hold. Then the sequence of $D_{\mathcal{P}^s_n}[0, \infty)$-valued processes $\{\mathcal{P}^D_{r_D^{-1} t}, t \ge 0\}$ converges in distribution to the structured genealogical process $\xi$ introduced

in Section fOl 

The proof of Proposition 3.1 is a direct consequence of the uniform convergence of the generator of $\mathcal{P}^D_{r_D^{-1}\cdot}$ (namely $r_D^{-1} G^{n,D}$) to the generator of $\xi$ and the finiteness of the state space. A coupling with $\xi$ shows that the first time at which both processes differ when started from the same value tends to infinity in probability, which is the main argument to obtain the desired convergence. The proof being immediate, we turn instead to the proof of Theorem 1.1.

Let us first introduce the following notation, for each $D \in \mathbb{N}$:

\[ \sigma^D_1 = \inf\{t \ge 0 : \mathcal{P}^D_t \in \Pi_n\}, \qquad \tau^D_1 = \inf\{t > \sigma^D_1 : \text{a 2-event occurs at } t\}, \]

and for all $i \ge 2$,

\[ \sigma^D_i = \inf\{t \ge \tau^D_{i-1} : \mathcal{P}^D_t \in \Pi_n\}, \qquad \tau^D_i = \inf\{t > \sigma^D_i : \text{a 2-event occurs at } t\}, \tag{8} \]

with the convention that $\inf \emptyset = +\infty$ and if $\sigma^D_i$ or $\tau^D_i = +\infty$, then the following random times are all equal to $+\infty$. Note that if a 2-event occurs, its outcome may still be in $\Pi_n$ (if all lineages gathered in identical demes merge into one lineage in each of these demes). In that case, $\sigma^D_{i+1} = \tau^D_i$. Let us also denote the ranked epochs of events occurring to the process $\mathcal{P}$ by $\sigma_i$, $i \ge 1$, including what we previously called the 'ghost events', with the conventions that $\sigma_1 = 0$ and $\sigma_k = +\infty$ for $k > j + 1$ if there are no more events after the $j$'th transition.
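To make the alternating stopping times in (8) concrete, here is a minimal sketch that extracts them from one recorded path. The encoding of a path as a list of `(time, is_two_event, in_pi_after)` tuples and the function name `stopping_times` are assumptions made for this illustration only; paths are taken to be right-continuous, so the state at an event time is the state just after the event.

```python
import math

def stopping_times(initial_in_pi, events, count):
    """Return ([sigma_1, ..., sigma_count], [tau_1, ..., tau_count]) for one path.

    events: list of (time, is_two_event, in_pi_after) with strictly increasing
    positive times; in_pi_after tells whether the state just after the event
    lies in Pi_n. Uses the convention inf(empty set) = +infinity.
    """
    in_pi_at = {time: in_pi for time, _two, in_pi in events}
    sigmas, taus = [], []
    lower, in_pi_at_lower = 0.0, initial_in_pi
    for _step in range(count):
        if math.isinf(lower):
            sigma = math.inf                       # all later times are infinite
        elif in_pi_at_lower:
            sigma = lower                          # already in Pi_n at time `lower`
        else:
            sigma = next((time for time, _two, in_pi in events
                          if time > lower and in_pi), math.inf)
        # tau: first 2-event strictly after sigma
        tau = next((time for time, two, _in_pi in events
                    if time > sigma and two), math.inf)
        sigmas.append(sigma)
        taus.append(tau)
        lower, in_pi_at_lower = tau, in_pi_at.get(tau, False)
    return sigmas, taus

path = [(0.1, False, True), (1.0, True, False), (1.2, False, True), (2.0, True, True)]
print(stopping_times(False, path, 3))
```

In the sample path, the last 2-event lands directly in $\Pi_n$, so the third entry of the first list coincides with the second entry of the second list, as in the remark following (8).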

Proof of Theorem 1.1. We start by proving the convergence of the one-dimensional distributions, then establish the convergence of the finite-dimensional distributions by induction on their dimension. Since the sample size is fixed, we drop the superscript $n$ in our notation.
As a first step, let us state the following definition and two lemmas, which will be useful in the course of the proof. For the sake of clarity, the proofs of the lemmas are postponed until after the proof of Theorem 1.1.

Definition 6. Let $(\chi_t, t \in [0, T))$ denote a $\Pi_n$-valued Markov process generated by $\Gamma^n$, where $T$ is defined as

\[ T = \inf\{t \ge 0 : \chi_t \notin \Pi_n\}. \]

Then, for all $\eta \in \Pi_n$, $\chi(\eta)$ is defined as a $\mathbf{P}_\eta$-random variable distributed like the outcome of the first geographical collision when $\chi$ starts at $\eta$ (this event is always defined if the $\lambda^g$'s satisfy (4) and $\eta$ has at least two blocks, since the coefficients $\lambda^g$ are the rates of a $\Xi$-coalescent as mentioned in Remark 2.2).

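Lemma 3.1. Let $i \ge 1$. Then for all bounded measurable functions $f$ on $\mathbb{R}_+ \times \mathcal{P}^s_n$, we have

\[ \lim_{D \to \infty} \mathbb{E}_\zeta\big[f(\sigma^D_i, \mathcal{P}^D_{\sigma^D_i})\, \mathbf{1}_{\{\sigma^D_i < \infty\}}\big] = \mathbf{E}_\zeta\big[f(\sigma_i, \mathcal{P}_{\sigma_i})\, \mathbf{1}_{\{\sigma_i < \infty\}}\big], \]

\[ \lim_{D \to \infty} \mathbb{E}_\zeta\big[f(\tau^D_i, \mathcal{P}^D_{\tau^D_i})\, \mathbf{1}_{\{\tau^D_i < \infty\}}\big] = \mathbf{E}_\zeta\big[f(\sigma_{i+1}, \chi(\mathcal{P}_{\sigma_i}))\, \mathbf{1}_{\{\sigma_{i+1} < \infty\}}\big]. \]

In particular, by taking $f(t, \eta) = \mathbf{1}_{\{t \le s\}}$ for all $s \ge 0$, we obtain that the law under $\mathbb{P}_\zeta$ of the $[0, +\infty]$-valued random variable $\sigma^D_i$ (resp. $\tau^D_i$) converges to the law under $\mathbf{P}_\zeta$ of $\sigma_i$ (resp. $\sigma_{i+1}$).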

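Lemma 3.2. Let $t \in (0, \infty)$ and let $i \in \mathbb{N}$ be such that $\mathbf{P}_\zeta[\sigma_i < \infty] > 0$. By Lemma 3.1, we also have for $D$ large enough $\mathbb{P}_\zeta[\sigma^D_i < \infty] > 0$. Let $f$ be a real-valued function on $\mathcal{P}^s_n$. Then

\[ \lim_{D \to \infty} \mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{[\sigma^D_i, \tau^D_i)}(t) \,\big|\, \sigma^D_i < \infty\big] = \mathbf{E}_\zeta\big[f(\mathcal{P}_t)\, \mathbf{1}_{[\sigma_i, \sigma_{i+1})}(t) \,\big|\, \sigma_i < \infty\big]. \]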



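Fix $t > 0$, let $f$ be a real-valued function on $\mathcal{P}^s_n$ and denote the supremum norm of $f$ by $\|f\|$. We have for each $D$ and all $N \in \mathbb{N}$:

\[
\begin{aligned}
\big|\mathbb{E}_\zeta[f(\mathcal{P}^D_t)] - \mathbf{E}_\zeta[f(\mathcal{P}_t)]\big|
&= \Big|\mathbb{E}_\zeta\Big[\sum_{i=1}^N f(\mathcal{P}^D_t)\, \mathbf{1}_{[\tau^D_{i-1}, \sigma^D_i)}(t) + \sum_{i=1}^N f(\mathcal{P}^D_t)\, \mathbf{1}_{[\sigma^D_i, \tau^D_i)}(t) + f(\mathcal{P}^D_t)\, \mathbf{1}_{\{t \ge \tau^D_N\}}\Big] \\
&\qquad - \mathbf{E}_\zeta\Big[\sum_{i=1}^N f(\mathcal{P}_t)\, \mathbf{1}_{[\sigma_i, \sigma_{i+1})}(t) + f(\mathcal{P}_t)\, \mathbf{1}_{\{t \ge \sigma_{N+1}\}}\Big]\Big| \\
&\le \sum_{i=1}^N \Big|\mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{[\sigma^D_i, \tau^D_i)}(t)\big] - \mathbf{E}_\zeta\big[f(\mathcal{P}_t)\, \mathbf{1}_{[\sigma_i, \sigma_{i+1})}(t)\big]\Big| \\
&\qquad + \sum_{i=1}^N \mathbb{E}_\zeta\big[|f(\mathcal{P}^D_t)|\, \mathbf{1}_{[\tau^D_{i-1}, \sigma^D_i)}(t)\big] + \Big|\mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{\{t \ge \tau^D_N\}}\big]\Big| + \Big|\mathbf{E}_\zeta\big[f(\mathcal{P}_t)\, \mathbf{1}_{\{t \ge \sigma_{N+1}\}}\big]\Big|,
\end{aligned}
\tag{9}
\]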

where $\tau^D_0 = 0$. Let $\varepsilon > 0$. The random variables $\sigma_i$ are the jump times of $\mathcal{P}$, the rates of which are bounded above by a constant $b > 0$. Thus, for each $N \ge 1$, $\sigma_{N+1}$ is stochastically bounded below by the sum of $N$ independent exponential random variables with parameter $b$, and so there exists $N \ge 1$ such that

\[ \mathbf{P}_\zeta[\sigma_{N+1} \le t] < \frac{\varepsilon}{4\|f\|}. \]
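The existence of such an $N$ can be made explicit, since the sum of $N$ independent exponential variables with parameter $b$ has an Erlang distribution. The following sketch, in which the function names and the numerical values are illustrative assumptions rather than quantities from the paper, finds the smallest $N$ whose Erlang distribution function at $t$ falls below a prescribed threshold.

```python
import math

def erlang_cdf(n, rate, t):
    """P[S_n <= t] where S_n is a sum of n i.i.d. Exp(rate) random variables.

    Uses the identity P[S_n <= t] = P[Poisson(rate * t) >= n].
    """
    x = rate * t
    term = math.exp(-x)      # Poisson probability of 0
    below = 0.0              # accumulates P[Poisson(x) <= n - 1]
    for k in range(n):
        below += term
        term *= x / (k + 1)  # Poisson probability of k + 1
    return 1.0 - below

def smallest_n(rate, t, threshold):
    """Smallest n >= 1 such that P[S_n <= t] < threshold (threshold > 0)."""
    n = 1
    while erlang_cdf(n, rate, t) >= threshold:
        n += 1
    return n

# Hypothetical values: rate bound b = 1, time t = 1, threshold 0.01.
print(smallest_n(1.0, 1.0, 0.01))
```

The loop always terminates for a positive threshold, because the Poisson tail probability tends to zero as $N$ grows.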



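In addition, $\tau^D_N \Rightarrow \sigma_{N+1}$ by Lemma 3.1, so there exists a $D_0$ such that for $D \ge D_0$,

\[ \mathbb{P}_\zeta[\tau^D_N \le t] < \frac{\varepsilon}{4\|f\|}. \]

Consequently, for $D \ge D_0$ we have

\[ \Big|\mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{\{t \ge \tau^D_N\}}\big]\Big| + \Big|\mathbf{E}_\zeta\big[f(\mathcal{P}_t)\, \mathbf{1}_{\{t \ge \sigma_{N+1}\}}\big]\Big| \le \|f\| \big(\mathbb{P}_\zeta[\tau^D_N \le t] + \mathbf{P}_\zeta[\sigma_{N+1} \le t]\big) < \frac{\varepsilon}{2}. \tag{10} \]

Let $i \in \{1, \dots, N\}$. We have

\[ \mathbb{E}_\zeta\big[|f(\mathcal{P}^D_t)|\, \mathbf{1}_{[\tau^D_{i-1}, \sigma^D_i)}(t)\big] \le \|f\|\, \mathbb{P}_\zeta[\tau^D_{i-1} \le t < \sigma^D_i] = \|f\| \big(\mathbb{P}_\zeta[\tau^D_{i-1} \le t] - \mathbb{P}_\zeta[\sigma^D_i \le t]\big). \]

By Lemma 3.1, both $\tau^D_{i-1}$ and $\sigma^D_i$ converge in law towards $\sigma_i$ (whose distribution function is continuous on $\mathbb{R}_+$), so the right-hand side of the last inequality tends to 0 when $D \to \infty$. Hence, there exists a $D_1$ such that for all $D \ge D_1$,

\[ \sum_{i=1}^N \mathbb{E}_\zeta\big[|f(\mathcal{P}^D_t)|\, \mathbf{1}_{[\tau^D_{i-1}, \sigma^D_i)}(t)\big] < \frac{\varepsilon}{4}. \tag{11} \]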

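Once again, let $i \in \{1, \dots, N\}$. If $\mathbf{P}_\zeta[\sigma_i \le t] = 0$, then $\mathbf{E}_\zeta[f(\mathcal{P}_t)\, \mathbf{1}_{[\sigma_i, \sigma_{i+1})}(t)] = 0$ and

\[ \Big|\mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{[\sigma^D_i, \tau^D_i)}(t)\big]\Big| \le \|f\|\, \mathbb{P}_\zeta[\sigma^D_i \le t] \to 0 \]

as $D$ tends to infinity, by Lemma 3.1 and the continuity of the distribution function of $\sigma_i$ in $t$. If $\mathbf{P}_\zeta[\sigma_i \le t] > 0$, we also have $\mathbb{P}_\zeta[\sigma^D_i \le t] > 0$ for $D$ large enough, so we can write

\[
\begin{aligned}
\mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{[\sigma^D_i, \tau^D_i)}(t)\big]
&= \mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{[\sigma^D_i, \tau^D_i)}(t) \,\big|\, \sigma^D_i < \infty\big]\, \mathbb{P}_\zeta[\sigma^D_i < \infty] \\
&\to \mathbf{E}_\zeta\big[f(\mathcal{P}_t)\, \mathbf{1}_{[\sigma_i, \sigma_{i+1})}(t) \,\big|\, \sigma_i < \infty\big]\, \mathbf{P}_\zeta[\sigma_i < \infty] \\
&= \mathbf{E}_\zeta\big[f(\mathcal{P}_t)\, \mathbf{1}_{[\sigma_i, \sigma_{i+1})}(t)\big],
\end{aligned}
\]

where the convergence on the second line stems from Lemma 3.2 and the convergence in distribution of $\sigma^D_i$ towards $\sigma_i$. Consequently, there exists $D_2$ such that for all $D \ge D_2$,

\[ \sum_{i=1}^N \Big|\mathbb{E}_\zeta\big[f(\mathcal{P}^D_t)\, \mathbf{1}_{[\sigma^D_i, \tau^D_i)}(t)\big] - \mathbf{E}_\zeta\big[f(\mathcal{P}_t)\, \mathbf{1}_{[\sigma_i, \sigma_{i+1})}(t)\big]\Big| < \frac{\varepsilon}{4}. \tag{12} \]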

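Combining (9), (10), (11) and (12), we obtain for all $D \ge \max\{D_0, D_1, D_2\}$

\[ \big|\mathbb{E}_\zeta[f(\mathcal{P}^D_t)] - \mathbf{E}_\zeta[f(\mathcal{P}_t)]\big| < \varepsilon. \]

We can hence conclude that

\[ \lim_{D \to \infty} \mathbb{E}_\zeta[f(\mathcal{P}^D_t)] = \mathbf{E}_\zeta[f(\mathcal{P}_t)], \]

which completes the proof of the convergence of the one-dimensional distributions of $\mathcal{P}^D$ under $\mathbb{P}_\zeta$ to the corresponding ones of $\mathcal{P}$ under $\mathbf{P}_\zeta$.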

Let us now turn to the convergence of the finite-dimensional distributions. We prove by induction on $p$ that, for all $0 < t_1 < \dots < t_p$, $\mathcal{L}_{\mathbb{P}_\zeta}(\mathcal{P}^D_{t_1}, \dots, \mathcal{P}^D_{t_p}) \Rightarrow \mathcal{L}_{\mathbf{P}_\zeta}(\mathcal{P}_{t_1}, \dots, \mathcal{P}_{t_p})$ as $D \to \infty$. By the preceding step, the case $p = 1$ is already established. Let $p \ge 2$, and suppose that the convergence holds for the $(p-1)$-dimensional distributions. Let $0 < t_1 < \dots < t_p$, and let $f_1, \dots, f_p$ be real-valued functions on $\mathcal{P}^s_n$. We denote the $\sigma$-field generated by $\{\mathcal{P}^D_s, s \in [0, t]\}$ by $\mathcal{F}^D_t$. Then,



Ec[lI/i(^)]=Ec[E[n/,(^) 

i=l i=l 



by the Markov property 



i=l 
_p-l 



where here and in the following $\tilde{X}$ denotes an independent version of the random variable $X$, the second expectation is taken with regards to $\tilde{X}$, and $p^D(\cdot, \cdot, s)$ is the transition kernel of $\mathcal{P}^D$ corresponding to time $s$. Continuing the preceding equalities, we obtain



p-i 

E /pM ^[n^w^^p-x^-^-vi) n n} 

p-1 



»;6Pf, 



(13) 
(14) 



For all 77 e P», 
p-i 



P - i<p-i) ^{vp £ n„} 



p-i 



<(nii^ii) p c[^-i^ n «]^° ( i5 ) 



by the convergence of to Vt p _ 1 in distribution and the finiteness of P* . As the sum in 1(13)1 is finite, 

((15)1 implies that this sum tends to when D grows to infinity. Moreover, the convergence in law of 
V®_ tp _ 1 to Vt p -t p -i, the finiteness of and the fact that 7 = 7 a.s. if 7 £ II„ enable us to write 



max max \p D {^1, r), t p — — ^(7, 77, t p — t p -i)\ — ► as £> — > 00, 



(16) 



where p(j, 77, t p — t p -x) is the transition kernel of V corresponding to time t p — t p -i, extended to 77 ^ II n 
by p(j, Tj, t p — t p -i) = 0. Now, we have for all 77 £ Pf> 



p-i 



E C [ II MK) V D (V»_ v ^t p -Vi) n, 

i=l 

p-1 

= ^<[Y[M^u) (p D (^_ 1 ,»7.tp-^-i)-K^-i.»7.*p-^-i)) Vf^e n„} 

i=l 
p-1 

+ E C [JJ pPS^mtp-tp-i) n„} 



(17) 
(18) 



The expression in (17) tends to 0 by (16) and dominated convergence. As for the quantity in (18), for each $\eta \in \mathcal{P}^s_n$ the function $\gamma \mapsto p(\gamma, \eta, t_p - t_{p-1})\, \mathbf{1}_{\{\gamma \in \Pi_n\}}$ (vanishing on $\mathcal{P}^s_n \setminus \Pi_n$) is necessarily continuous and bounded on the finite set $\mathcal{P}^s_n$, so by the induction hypothesis for $p - 1$, we have



J™, Ec [ 11 M V u) P(K-n V, tp - tp-i) hvp^ e n„}] = % [ II *( p *<) P^Vn *p ~ *p-0 



The two latter results, together with (14), (15), the finiteness of the sums and the Markov property applied to $\mathcal{P}$ lead to



p-i 



i=l 



i=l 



:[n/i(^) 



As any real-valued function on $(\mathcal{P}^s_n)^p$ can be obtained as a uniform limit of product functions, the convergence of the $p$-dimensional distributions is proven. The proof of Theorem 1.1 is complete by the induction principle.
□

Proof of Lemma 3.1. Let us start by proving that $\sigma^D_1$ converges in probability to 0. If $\zeta \in \Pi_n$, then $\sigma^D_1 = 0$ a.s. for all $D$, so the convergence trivially holds. If $\zeta \notin \Pi_n$, then $\sigma^D_1 > 0$ a.s. and with the notation introduced previously, we have for each function $f$ on $\mathcal{P}^s_n$

\[ G^D f(\zeta) = r_D \Psi f(\zeta) + \Gamma f(\zeta) + R_D f(\zeta), \]

where ^/(O ( m fact, this holds for any 77 ^ ±1„, and consequently for all values of V? , t € [0, erf)). 
Let us write $r_D c_\Psi(\zeta)$ (resp. $c_\Gamma(\zeta)$, $c_{R_D}(\zeta)$) for the total rate of the non-trivial events generated by $r_D \Psi$ (resp. $\Gamma$, $R_D$) when $G^D f$ is applied to $\zeta$. As events are discrete for each $D$, we can write for $s > 0$

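\[
\begin{aligned}
\mathbb{P}_\zeta[\sigma^D_1 > s]
&\le \mathbb{P}_\zeta\big[\text{at most } n\ \Psi\text{-events and then a } \Gamma\text{- or } R_D\text{-event occur in } [0, s] \text{ and } \mathcal{P}^D_u \notin \Pi_n\ \forall u \in [0, s]\big] \\
&\quad + \mathbb{P}_\zeta\big[\text{at most } n\ \Psi\text{-events and no } \Gamma\text{- or } R_D\text{-events occur in } [0, s], \text{ and } \mathcal{P}^D_u \notin \Pi_n\ \forall u \in [0, s]\big] \\
&\quad + \mathbb{P}_\zeta\big[\text{more than } n\ \Psi\text{-events occur before the first } \Gamma\text{- or } R_D\text{-event}\big].
\end{aligned}
\tag{19}
\]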

Since the events generated by $\Psi$ correspond to the structured genealogical process $(\xi_t, t \ge 0)$ started at $\zeta$ as long as no $\Gamma$- or $R_D$-events occurred, by the bound on the number of transitions of $\xi$ ($n$, see the previous section), the third term on the right-hand side of (19) vanishes. Moreover, the probability that the next event generated by $G^D$ is a $\Gamma$- or an $R_D$-event when the current value of $\mathcal{P}^D$ is $\eta \notin \Pi_n$ is given by

\[ \frac{c_\Gamma(\eta) + c_{R_D}(\eta)}{c_\Gamma(\eta) + c_{R_D}(\eta) + r_D c_\Psi(\eta)} \to 0, \qquad D \to \infty, \]

since $c_\Psi(\eta) > 0$ for such an $\eta$, and this is precisely the kind of situation required to be in the configuration given by the first term of (19). So by bounding this term by the maximum over $\eta \notin \Pi_n$ of the probabilities calculated just before, we obtain that the first term of (19) tends to 0 as $D$ grows to infinity. To finish, for each $D$ and all $k \in \{1, \dots, n\}$ let us call $U^D_k$ the random time of the $k$'th event occurring to $\mathcal{P}^D$, with the convention that $U^D_k = +\infty$ if there are fewer than $k$ such events. If $k$ events occur (i.e., $U^D_k < \infty$) and $\mathcal{P}^D$ stays out of $\Pi_n$, then $U^D_{k+1}$ is stochastically bounded by the sum of $k + 1$ i.i.d. exponential variables with parameter $r_D \min_{\eta \notin \Pi_n} c_\Psi(\eta)$, whose distribution becomes concentrated close to 0 as $D$ grows since $\min_{\eta \notin \Pi_n} c_\Psi(\eta) > 0$. Consequently,

\[
\begin{aligned}
&\mathbb{P}_\zeta\big[\text{exactly } k\ \Psi\text{-events and no } \Gamma\text{- or } R_D\text{-events occur in } [0, s], \text{ and } \mathcal{P}^D_u \notin \Pi_n\ \forall u \in [0, s]\big] \\
&\qquad \le \mathbb{P}_\zeta\big[U^D_{k+1} > s,\ U^D_k < \infty \text{ and } \mathcal{P}^D_u \notin \Pi_n\ \forall u \in [0, s]\big] \to 0.
\end{aligned}
\]

As the second term in (19) is bounded by the sum over $k \in \{0, \dots, n\}$ of the preceding quantities, it converges to zero. Hence, $\mathbb{P}_\zeta[\sigma^D_1 > s] \to 0$ for all $s > 0$ and $\sigma^D_1 \to 0$ in probability.
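The ratio used above is the usual competing-exponential-clocks formula: the next event is slow with probability equal to the total slow rate divided by the total rate. A minimal numerical sketch follows, with hypothetical rate values and a function name (`slow_event_probability`) introduced only for this illustration.

```python
def slow_event_probability(c_gamma, c_r, c_psi, r_d):
    """Probability that the next event is a Gamma- or R_D-event rather than a Psi-event.

    c_gamma, c_r: total rates of the slow and remainder events at the current state;
    r_d * c_psi:  total rate of the fast events (requires c_psi > 0).
    """
    slow = c_gamma + c_r
    return slow / (slow + r_d * c_psi)

# Hypothetical rates: the probability decays like 1 / r_D.
for r_d in (1e1, 1e3, 1e5):
    print(f"r_D = {r_d:g}: {slow_event_probability(1.0, 0.1, 0.5, r_d):.2e}")
```

Because at most $n$ fast events are needed to return to $\Pi_n$, a union bound over these steps keeps the overall probability of seeing a slow event during the excursion of the same order, which is the content of the bound on the first term of (19).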
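Now, let $f$ be a function on $\mathcal{P}^s_n$. For each $s > 0$, we have

\[ \mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1})\, \mathbf{1}_{\{\sigma^D_1 \le s\}}\big] = \mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1})\big] - \mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1})\, \mathbf{1}_{\{\sigma^D_1 > s\}}\big]. \tag{20} \]

By the convergence in probability of $\sigma^D_1$ to 0 and the fact that $f$ is bounded, the second term in the right-hand side of (20) vanishes as $D$ grows to infinity. Furthermore, we have

\[
\begin{aligned}
\mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1})\big]
&= \mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1});\ \text{only } \Psi\text{-events before } \sigma^D_1\big] \\
&\quad + \mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1});\ \text{at least one } \Gamma\text{- or } R_D\text{-event before } \sigma^D_1\big].
\end{aligned}
\tag{21}
\]

The second term in (21) is bounded by $\|f\|\, \mathbb{P}_\zeta[\text{at least one } \Gamma\text{- or } R_D\text{-event before } \sigma^D_1]$, which tends to 0 by the preceding calculations, giving as a by-product that $\mathbb{P}_\zeta[\text{only } \Psi\text{-events before } \sigma^D_1] > 0$ for $D$ large enough. Moreover, when only $\Psi$-events occurred between 0 and $\sigma^D_1$, then the evolution of $\mathcal{P}^D$ between these two times is driven by the structured genealogical process $\xi$ started at $\zeta$, so $\mathcal{P}^D_{\sigma^D_1}$ has the same distribution as $\bar\zeta$, the final state of $\xi$ started at $\zeta$. Thus,

\[
\begin{aligned}
\mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1});\ \text{only } \Psi\text{-events before } \sigma^D_1\big]
&= \mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1}) \,\big|\, \text{only } \Psi\text{-events before } \sigma^D_1\big]\, \mathbb{P}_\zeta[\text{only } \Psi\text{-events before } \sigma^D_1] \\
&= \mathbb{E}[f(\bar\zeta)]\, \mathbb{P}_\zeta[\text{only } \Psi\text{-events before } \sigma^D_1] \\
&\to \mathbb{E}[f(\bar\zeta)].
\end{aligned}
\]

Together with (20) and (21), we obtain that

\[ \lim_{D \to \infty} \mathbb{E}_\zeta\big[f(\mathcal{P}^D_{\sigma^D_1})\, \mathbf{1}_{\{\sigma^D_1 \le s\}}\big] = \mathbb{E}[f(\bar\zeta)] = \mathbf{E}_\zeta\big[f(\mathcal{P}_0)\, \mathbf{1}_{\{0 \le s\}}\big]. \]

A monotone class argument enables us to conclude the same result for any bounded measurable function $f$ on $\mathbb{R}_+ \times \mathcal{P}^s_n$.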

Let us now investigate the convergence of $\tau^D_1$. Recall that if $\eta \in \Pi_n$ and $f \in \mathcal{D}(G^D)$, then

\[ G^D f(\eta) = \Gamma f(\eta) + R_D f(\eta). \]

If a > 0, by the strong Markov property applied to V D at time erf we have 

P c [rf > a] = E C [P^ [ff > a - af ] I {s>fff } ] + E c [P v » [ff > a - erf] I {s < CT? <oo} ] . (22) 
"1 "1 

The second term in (22) is equal to $\mathbb{P}_\zeta[s < \sigma^D_1 < \infty]$, which tends to 0 when $D$ grows to infinity. If a $\Gamma$-event occurs when the current value of $\mathcal{P}^D$ lies in $\Pi_n$, it is necessarily a 2-event, hence the first term is equal to

Ec [ ^{s><j d } ~P-p D [ n0 T~ or Rd~ events between and s — erf]] 

1 "\ 

+ E^ [lr s>0 .D\ P p d [no T— events and at least one Rd — event between and s — erf; ff > a — erf]] . 
1 "1 

But for all 77 £ n n , 

P,) [ no T— events and at least one Rd— event between and s — erf; ff > s — erf] 

< P, ; [no T— events and at least one Rd— event between and s] 

< 1 - cxp ( - s max c Rd (7)) -> 0, 

so by dominated convergence, 

E,j[l| s>0 .D} P p d [no T— event and at least one Rd~ event between and s — erf; f f > s — erf]] — * 0. 
Consequently, 

P c [rf > a] = E c [l {s>CTf } exp-{( Cr (^ f ) + c Rd (V%)){s - erf)}" 

by the preceding convergence result for $(\sigma^D_1, \mathcal{P}^D_{\sigma^D_1})$ and the uniform convergence of $c_{R_D}$ towards 0. We can thus conclude that the law of $\tau^D_1$ under $\mathbb{P}_\zeta$ converges to the law of $\sigma_2$ under $\mathbf{P}_\zeta$.
Now, by the strong Markov property applied to $\mathcal{P}^D$ at time $\sigma^D_1$, we have

E C [ I KD<S} f(V?») 1 {T » <oo} ] (23) 
= E^ [ I^ CT D <00 j.EpD D [ I{ f D <s _ (T p} f(P? D ); the first event is an Rd— event]] 

+ E c[ l {a o <oo} E V D D [ l {f D <s _ a D } f(P?p); the first event is a r-event]]. 
The absolute value of the first term in the right-hand side of (23) is bounded by

\[ \|f\| \max_{\eta \in \Pi_n} \mathbb{P}_\eta[\text{a first event occurs and is an } R_D\text{-event}] \to 0. \]

Moreover, if $\mathcal{P}^D_0 = \eta \in \Pi_n$ and the first event is a $\Gamma$-event, then $\tau^D_1$ is the time of that first event and $\mathcal{P}^D_{\tau^D_1}$ its outcome. Therefore, both are independent and $\mathcal{P}^D_{\tau^D_1}$ is distributed like $\chi(\eta)$, so the second term in (23) is equal to

E C [ I { D<oo} E„ C < D [I { ,o <s _ a n } f{ X {P°)) ]] +0(1) 

= E C [ I {CTf <oo} P pD [ff < a - of] E^ [/(x(^))] ] + o(l). 

Let us write 

P v o d [ff < s - of] - [rf < a] - P v o d [ff G [* - of , fl ]] 

"i "i "i 

and fix e > 0. For any 5 > 0, we have 

E f [i {CTf <oo} [ff e [s - of, fl ]] e^ d [/(x(n D ))] ] 

< E ( [I {<<s} P v o d [ff G [*-*,*]] E^[|/( X (^))|] ] + ll/H P c [of G [<5,oo)]. 
By the convergence in probability of of to 0, there exists Dq > 1 such that for all D > £>o, 

P c [of G [<J, oo)] < 



311/11 



Let 77 G Il„. By the continuity of the distribution function of 02, there exists So > such that 

e 



Pr,[<5- 2 G [S-£ ,S]] < - 



3||/| 



< 



311/11' 



In addition, ff converges in distribution to 02, hence there exists D\ > 1 such that for all D > Di, 

P n [ff G [s - o" , «]] - Pt, [cr2 e [s - <5 , 4 

Since n„ is a finite set, we can conclude that for S > small enough, and D large enough, we have 
E C [ l {<J?<s] P v o D [ff G [a -S,s]] E pD [\f( X (VS , ))\] ] + 11/11 PJof G [5, 00)] 

< E C [ I { D<5} max P, [ff G [* - S, a}] E v o [|/(x(^))|] ] + { 
<e. 

Now, T) 1 ^ P?)^/ 3 < s] converges uniformly in 77 G n„ to r/ P^[o2 < s] and 

(5,77) 1 ► I {s<oo} I { ^ n „ } P^[cr 2 < s ]K n [f( x (r))] 

is a bounded measurable function, so by the convergence in distribution of (of ,7> D n ) proven above, for 
D large enough we have 

E C [ I {TiD<s} f(V?n) l {T o <oo} ] -%[I {CT2<S} f(xCP*x)) 

E c [I {rf<s} f(P°>) I {T n <oo} ] -E i [l {ffl<oo} Pp CT Ja 2 < s] E Pcti [f( X (P ))]\ 
<3e. 

Letting $\varepsilon$ tend to zero yields the desired result (once again by invoking monotone classes) and completes the step $i = 1$ of the proof of Lemma 3.1.
Suppose that the desired properties hold for $i - 1$. Let $f$ be a bounded continuous function on $\mathbb{R}_+ \times \mathcal{P}^s_n$. Since $\mathbf{1}_{\{\sigma^D_i < \infty\}} = \mathbf{1}_{\{\sigma^D_i < \infty\}} \mathbf{1}_{\{\tau^D_{i-1} < \infty\}}$, the strong Markov property applied to $\mathcal{P}^D$ at time $\tau^D_{i-1}$ gives

Mf(°?>Kp)h«? <oo}] - E c [l {Tf i<oo} E^ [/(r^ + of ) V f<oo} ]] 

But, for all 77 £ LT n , if X denotes a random variable whose distribution under P n is that of 77 (e.g. £ T7r in 
the notation of Proposition ^. ip . then 

|E„[/(f + d?,Vf ? <oo} ] - E„[/(t, X)]\ < \E v [f(t + a?,V?o) I { * f <oo} ] - E„[/(t + of ,X) <oo} ] 

+ |E,[/(< + af,X) I{5 f<oo} ]-E ?7 [/(t ) A-)]|. 

Since VP D has the same distribution as 77 under P^ if only ^/-events occurred between and of , the first 
term is equal to 

\E n [l {a n <oo} (f(t + of ,fi?p) - /(t + df ,X)); not only ^-events between and of] | 

< 2||/|| max Pn[of < 00, not only ^—events between and of] — > 

by the calculations done in the proof of the convergence of of. Moreover, 

E„[/(t + of ,X) % f<oo} ] - E n [f(t,X)} -» 

uniformly in 77 by the convergence in probability of of towards and the finiteness of the number of 
states that X can take. Therefore, 

V v [f{t + a?,V? ? ) l {a n <oo} }^E v [f(t,X)}=E n [f(t,Vo)} 

uniformly in (t,rf). This uniform convergence (which trivially holds also for 77 G LT„ since 77 = 77 and 
of = a.s.), together with the induction hypothesis for i — 1 yields 

Jim^E c [l {Tfi<oo} E^ ^ [/(rf x + of, Vf ? ) <oo} ]] = If [l^^E^^f/^, X)]]. 

But, from the description of the evolution of V in terms of the geographical events followed by the 
instantaneous action of the structured genealogical process £, we see that the random variable xC^V»-i) 
is distributed precisely like T' (7i (if Oj < 00). Consequently, 

lim E c [/(of,pf P )I { D<oo} ] =E c [I {(74<oo} /(o i ,7' ff4 )]. 

The same technique applies to (rf ,V D p), where this time we use the strong Markov property at time 
of and the following convergence result: 

E^[/(t + rf,P^) I {TiD<oo} ]^E 7) [/(t + o 2 ,x00) I{. 2 <oo } ] 

uniformly in (£,77) G R+ X n„, where under P^, Y is a.s. equal to 77. 

□ 

Proof of Lemma 3.2. If an event occurs in the (random) interval $[\sigma^D_i, \tau^D_i)$, the first such event can be neither a $\Psi$-event since $\mathcal{P}^D_{\sigma^D_i} \in \Pi_n$, nor a $\Gamma$-event since $\mathcal{P}^D$ would then undergo a 2-event before time $\tau^D_i$, contradicting the definition of $\tau^D_i$, so it must be an $R_D$-event. Consequently, if we write

E ( [f(V t D ) Vf,rf)(*)| °? < °°] = E c[f(' P t D ) \o?.T°){t)- nothing happens in [of ,t}\ of < 00] (24) 

+ E^ [f(Vf ) l|„D iT D)((); something occurs in [of ,t]\ of < 00] , 
then the absolute value of the second term of the right-hand side of (24) is bounded by

\[ \|f\|\, \mathbb{P}_\zeta\big[\sigma^D_i \le t \text{ and an } R_D\text{-event occurs in } [\sigma^D_i, t] \,\big|\, \sigma^D_i < \infty\big] \le \|f\| \big(1 - \exp(-t \max_\eta c_{R_D}(\eta))\big) \to 0, \]

where in the exponential the maximum is over $\eta \in \mathcal{P}^s_n$, and recall that $c_{R_D}(\eta)$ is the total rate at which $R_D$-events occur when the current value of $\mathcal{P}^D$ is $\eta$.
The first term of (24) is equal to



E ? [ f(P® p ) I^d iT p)(t)', nothing happens in [of ,t]| of < 00] 

= E c [f(V° P ) Vf,rf)(*) I of < 00] -E C [/(^ ? ) Vf,rf)W ; somcthin g happens in [of of < 00] 
As before, 

E c[/Cf^f ) Vf )(*): something happens in [of of < 00] < ||/||(l - cxp(-< maxcH D (?j))) 
and furthermore 

K<W%>) VVf)W I ^ < °°] =E c [/(^f D ) I {CT? < t} I of < 00] -E c [/(7>£) I {r ^< t} I of < oo]. 

(25) 

On the one hand, by Lemma l3~Tl and the fact that P^[cf < 00] — > P^[o^ < 00] > 0, the first term in (|25| 
converges as Z? tends to infinity to 

h^<t} I £T< <00]. 

On the other hand, by the strong Markov property applied to P D at time of, the second term in 
(|25f is equal to 

E C [f{V% ) P v o d [ff < i] I of < 00] . 

The function r\ 1— » P,, [ff < t] converges uniformly in 77 6 II„ to 77 1— > P,, [o 2 < t] , so by Lemma 13.11 we 
obtain 



D 



lim E C [f(V%>) P v o [ff < i] I of < 00] = E c [f(V ai ) W v [er a < t] | o, < 00] 



= E c [/(7^) I {CTi+1 < t} I ff* < 00] 



by the strong Markov property applied this time to V at time Oj. Combining the above, we obtain the 
desired result. 

□ 

The results obtained in this section are similar in spirit to perturbation theorems such as Theorem 1.7.6 in Ethier and Kurtz (1986). Indeed, in our case the existence of a projector $p$ corresponding to $\Psi$ and the convergence of the semigroup required (see condition (7.12) and Remark 1.7.5 in Ethier and Kurtz, 1986, p. 39) easily follows from Lemma 2.1 and the finiteness of $\mathcal{P}^s_n$. Furthermore, the existence of a limit for $r_D^{-1} G^D$ is obvious from the form of $G^D$. However, condition (7.17) of Theorem 1.7.6 requires the existence of a subspace $E$ of functions on $\mathcal{P}^s_n$ such that for every $f \in E$, there exist functions $g, f_1, f_2, \dots$ satisfying

\[ \|f - f_D\| \to 0 \quad \text{and} \quad \|g - G^D f_D\| \to 0 \quad \text{as } D \to \infty. \]

The condition on $(f_D)_{D \ge 1}$ and the finiteness of $\mathcal{P}^s_n$ yield

\[ G^D f_D = r_D \Psi f + o(r_D), \]

implying that a corresponding function $g$ can exist only if $\Psi f = 0$. Although $\Psi f(\zeta) = 0$ if $\zeta \in \Pi_n$, this condition would also require that $f(\zeta) = 0$ whenever $\zeta \notin \Pi_n$. Hence, to fit into Ethier and Kurtz' framework, an obvious candidate for $E$ would be

\[ E = \{f : f(\zeta) = 0 \text{ for all } \zeta \in \mathcal{P}^s_n \setminus \Pi_n\}, \]
where we then define a bounded linear transformation $p_n : \mathcal{P}^s_n \to \Pi_n$ such that $p_n(\eta) = \eta$ for every $\eta \in \Pi_n$. We may then apply Theorem 1.7.6 of Ethier and Kurtz (1986) and obtain convergence of the semigroup corresponding to (or equivalently here of the finite-dimensional distributions of) $p_n(\mathcal{P}^D)$ to that of $\mathcal{P}$. However, it is unclear how to define $p_n$ on the set $\mathcal{P}^s_n \setminus \Pi_n$, that is to specify how to project $\mathcal{P}^s_n$ down onto its subset $\Pi_n$, in such a way that the operator $\{(f \circ p_n, (\Gamma^n \circ p)(f \circ p_n)),\ f : \Pi_n \to \mathbb{R}\}$ generates a Markov process. Unfortunately, unless this condition is satisfied, Theorem 1.7.6 cannot be used to deduce the convergence result given in our Theorem 1.1.



3.3 Tightness 

The convergence of the finite-dimensional distributions relies on the fact that the time required for the process to re-enter $\Pi_n$ following a geographical collision is vanishingly small as $D$ tends to infinity. On the other hand, multiple changes to the configuration of the genealogy can occur during this short period with high probability, so that the conditions for $\mathcal{P}^D$ to converge as processes in $D_{\mathcal{P}^s_n}([0, \infty))$ are much more delicate.

Recall the definition of the stopping times $\sigma^D_i$ and $\tau^D_i$ given in (8). Suppose that the probability that a 2-event results in a configuration not in $\Pi_n$ vanishes as $D \to \infty$, or equivalently that

\[ \lim_{D \to \infty} \mathbb{P}_\zeta\big[\tau^D_1 < \infty,\ \mathcal{P}^D_{\tau^D_1} \notin \Pi_2\big] = 0, \tag{26} \]

where $\zeta = (\{\{1\}\}, \{\{2\}\})$ and the equivalence stems from the consistency equations (4). Then, we easily see that the first time $\tau$ after $\sigma^D_1$ such that $\mathcal{P}^D_\tau \notin \Pi_n$ converges to $+\infty$ in probability. Since

\[ G^D f(\eta) = \Gamma f(\eta) + o(1) \quad \text{as } D \to \infty \]

if $\eta \in \Pi_n$, we readily obtain that for any $a > 0$, the sequence of processes $(\{\mathcal{P}^D_t, t \ge a\}, D \ge 1)$ is tight (recall that $\sigma^D_1$ converges in probability towards 0). Let us now show that if condition (26) does not hold, the sequence $\mathcal{P}^D$ is not tight. It will be easier to work with a metric on $\mathcal{P}^s_n$, the associated topology still being the discrete topology.

Proposition 3.2. Assume that (26) does not hold. Let $d$ be a discrete metric on $\mathcal{P}^s_n$, and suppose that $\zeta \in \mathcal{P}^s_n$ is such that $\mathbf{P}_\zeta[\sigma_2 < \infty] > 0$. Then the sequence of processes $\mathcal{P}^D$ under $\mathbb{P}_\zeta$ is not tight in $D_{\mathcal{P}^s_n}([0, \infty))$ endowed with the Skorokhod topology corresponding to $d$.

Proof. First, recall the definition of the modulus of continuity $w'$ given in Ethier and Kurtz (1986), p. 122: for $X \in D_{\mathcal{P}^s_n}([0, \infty))$, $\delta > 0$ and $T > 0$,

\[ w'(X, \delta, T) = \inf_{\{t_i\}} \max_{i} \sup_{s, t \in [t_{i-1}, t_i)} d(X_s, X_t), \tag{27} \]

where the infimum is over all finite sets of times of the form $0 = t_0 < t_1 < \dots < t_{n-1} < T \le t_n$ such that $\min_{1 \le i \le n}(t_i - t_{i-1}) > \delta$ and $n \ge 1$.

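Suppose that the sequence $\mathcal{P}^D$ is tight. $\mathcal{P}^s_n$ is a finite set, so the discrete topology on $(\mathcal{P}^s_n, d)$ turns it into a complete and separable metric space, therefore $\mathcal{P}^D$ is also relatively compact. By Corollary 3.7.4 of Ethier and Kurtz (1986), this implies that for every $\gamma \in \mathcal{P}^s_n$, all $\eta > 0$ and $T > 0$, there exists $\delta > 0$ such that

\[ \limsup_{D \to \infty} \mathbb{P}_\gamma\big[w'(\mathcal{P}^D, \delta, T) \ge \eta\big] \le \eta. \tag{28} \]

Besides, the finiteness of $\mathcal{P}^s_n$ guarantees the existence of $\varepsilon > 0$ such that, if $\gamma \ne \gamma' \in \mathcal{P}^s_n$, then $d(\gamma, \gamma') > \varepsilon$. Let $T = 1$, $\eta \in (0, \varepsilon)$ and $\delta \in (0, 1)$. We have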



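$$\mathbb{P}_\zeta\big[w'(\mathcal{P}^D,\delta,1) \ge \eta\big] = \mathbb{P}_\zeta\Big[w'(\mathcal{P}^D,\delta,1) \ge \eta \,\Big|\, \tau_1^D < \tfrac12\Big]\, \mathbb{P}_\zeta\Big[\tau_1^D < \tfrac12\Big] + \mathbb{P}_\zeta\Big[w'(\mathcal{P}^D,\delta,1) \ge \eta \,\Big|\, \tau_1^D \ge \tfrac12\Big]\, \mathbb{P}_\zeta\Big[\tau_1^D \ge \tfrac12\Big].$$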



(29)

On the one hand, $\sigma_1^D$ converges to $0$ in probability and $\tau_1^D \Rightarrow \sigma_2$, so by Slutsky's lemma (see Lemma 2.8 in
van der Vaart, 1998) $\tau_1^D - \sigma_1^D \Rightarrow \sigma_2$, which is an exponential random variable with positive parameter,



m 



so we have 



< 



erf < - and rf o r 



< 



cr 2 < 



1 

GJ 

C > 



< 



erf > — and rf — a\ 
o 



< 



since the last term in the second line vanishes by the convergence in probability of $\sigma_1^D$ to $0$. On the other
hand,



w'(V D ,S, 1) > r\ 



1 

<2 



w'(V D ,5,l) > r) 



- 1 D 

< 2' ff2 

< 6 -, v» ? i n„ 



rf < -, p£ £ n„ 



<2 



(30) 



By the convergence in probability of $\sigma_1^D$ to $0$, uniformly in $\eta \in \mathcal{P}_n^s$, and the strong Markov property
applied to $\mathcal{P}^D$ at time $\tau_1^D$, we obtain that $\sigma_2^D - \tau_1^D$ converges in probability to $0$. Furthermore, on the
event that no $R_D$-events occurred between the times $\sigma_1^D$ and $\tau_1^D$ (the probability of which is growing to
one), $\tau_1^D$ is the epoch of the first event after $\sigma_1^D$ and $\mathcal{P}^D_{\tau_1^D}$ its outcome so, by the strong Markov property,
$\tau_1^D$ and $\mathcal{P}^D_{\tau_1^D}$ are independent conditionally on $\mathcal{P}^D_{\sigma_1^D}$. Since (26) does not hold, we can write



lim inf Pr 



n 



rf < -, v% £ n n 



rf < 



> 



liminfP c [pf D ^n„l > 



Now, if rf < |, ^ n„ and erf - rf < §, then by definition of e, 

d(P£_,P£)>e and d(7>£, P%) > e 
and by assumption erf — rf < I, so w'{V D , <5, 1) > e > 77. Consequently, 



lim inf P 



w'(V D ,S, 1) > 77 



<2 



> 0. 



Therefore, we see from (29) that, for all $\delta \in (0,1)$,

$$\liminf_{D\to\infty} \mathbb{P}_\zeta\big[w'(\mathcal{P}^D,\delta,1) \ge \eta\big] \ge C' > \eta$$

for any $\eta \in (0, \varepsilon \wedge C')$. This yields a contradiction with (28). □



From the last proof, we see that what prevents the sequence of structured genealogical processes
from being tight is that at each geographical collision, at least two jumps accumulate: the geographical
collision itself and one or more transitions generated by $\xi$ to bring $\mathcal{P}^D$ back into $\Pi_n$.

Yet the unstructured genealogical process, which is not a Markov process for $D < \infty$, is not modified
by movements of blocks. Thus, if the number of jumps needed by $\mathcal{P}^D$ to re-enter $\Pi_n$ after a geographical
collision were at most one with a probability growing to 1, we would expect tightness to hold for $\mathcal{P}^{D,u}$
(recall that $\zeta^u$ denotes the unstructured partition generated by $\zeta$). The next proposition in fact gives an
equivalence between the behaviour of the latter probability and tightness of $\{\mathcal{P}^{D,u}, D \ge 1\}$.

Proposition 3.3. For each $D \in \mathbb{N}$, let $U_1^D$ be the random time defined by

$$U_1^D = \inf\{t > 0 : \mathcal{P}^D_t \neq \mathcal{P}^D_0\},$$

with the convention that $\inf \emptyset = +\infty$. Note that, if $\mathcal{P}^D_0 \notin \Pi_n$, then $U_1^D \le \sigma_1^D$. Let also $\chi(\Pi_n)$ denote the
image of $\Pi_n$ by the first geographical collision (when it exists), that is

$$\chi(\Pi_n) = \{\gamma \in \mathcal{P}_n^s : \exists\, \zeta \in \Pi_n,\ \mathbb{P}[\chi(\zeta) = \gamma] > 0\}.$$

Suppose that for all $\gamma \notin \Pi_n$, $\mathbb{P}[\underline{\gamma}^u \neq \gamma^u] > 0$ (meaning that the process $\xi$ started at $\gamma$ has at least one
coalescence with positive probability).
Then the following are equivalent:

(i) For all $\gamma \in \chi(\Pi_n) \setminus \Pi_n$, $\lim_{D\to\infty} \mathbb{P}_\gamma[U_1^D = \sigma_1^D] = 1$.

(ii) For all $\zeta \in \mathcal{P}_n^s$ and $a > 0$, the sequence of $D_{\mathcal{P}_n}([a,\infty))$-valued random variables $\mathcal{P}^{D,u}$, started at
$\zeta^u$ at time 0, is tight.

Furthermore, if $\zeta \in \Pi_n \cup \chi(\Pi_n)$, then condition (i) is equivalent to the tightness in $D_{\mathcal{P}_n}([0,\infty))$ of
$\mathcal{P}^{D,u}$ started at $\zeta^u$.

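As a consequence of Theorem 1.1, if conditions (i) and (ii) hold, then for all $\zeta \in \mathcal{P}_n^s$ and $a > 0$, the
law of $(\mathcal{P}^{D,u}_t, t \ge a)$ under $\mathbb{P}_\zeta$ converges to the law under $\mathbb{P}_{\underline{\zeta}^u}$ of $(\mathcal{P}^u_t, t \ge a)$. Furthermore, if $\zeta \in \Pi_n$,
then the convergence holds for $a = 0$.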

Remark 3.1. Assuming that $\mathbb{P}[\underline{\gamma}^u \neq \gamma^u] > 0$ for all $\gamma \notin \Pi_n$ is actually not required, but dropping this
assumption makes the proof unnecessarily more involved.

Proof. Once again we work with a metric $d$ on $\mathcal{P}_n$, so that $D_{\mathcal{P}_n}([0,\infty))$ is a complete and separable
metric space and the sequence $(\mathcal{P}^{D,u})_{D \ge 1}$ is tight if and only if it is relatively compact. We call $\varepsilon > 0$ the
minimum distance between two different partitions. Let us first show that if condition (i) is not fulfilled,
then neither is condition (ii). The following proof is highly reminiscent of the proof of Proposition
3.2, so let us adopt directly the same notation. In particular, we work with $T = 1$ and $\zeta$ such that

$$\mathbb{P}_\zeta[\sigma_2 < \infty] > 0.$$

For each $a \ge 0$, let us write $w'_a$ for the modulus of continuity of a process corresponding to times $t \ge a$,
defined as in (27) with the condition on the finite sets $\{t_i\}$ replaced by $a = t_0 < \cdots < T \le t_n$. Fix
$a \in (0,1/3)$, and let $\eta \in (0,\varepsilon)$ and $\delta \in (0,1)$. The same calculation as in the proof of Proposition
3.2 holds by replacing the event $\{w'(\mathcal{P}^D,\delta,1) \ge \eta\}$ by $\{w'_a(\mathcal{P}^{D,u},\delta,1) \ge \eta\}$ and $\mathbb{P}_\zeta[\tau_1^D < 1/2]$ by
$\mathbb{P}_\zeta[1/3 < \tau_1^D < 1/2]$. Hence, by (30) and the argument directly following it, we just need to prove that



1 



1 

2' 



! < 2- Kp i n « 



P ( [w' a (V D ' u ,S,l)> v 

is bounded below by a positive constant for $D$ large enough. If we define $V_1^D$ by

$$V_1^D = \inf\{t > \tau_1^D : \mathcal{P}^D_t \neq \mathcal{P}^D_{\tau_1^D}\},$$

then the expression in (31) is equal to

P C [ <(7> D '", 6, 1) > r? ; 7f D M = or ^ = p£ u 

L T l v l '1 °2 

+P< k(^' u , 5, 1) > »? ; P 1 !;" 7^ ^ ; ± V% 

L r l v l v l °2 

The first term in (32) is nonnegative, and if we are in the conditions given by the second term, then



(31) 



3^4 <* 



r? < s -, v? F i n„] (32) 



-.D.u 



1 



1 



f <-, 7^n r , 



^",^)>e, d(^,^5 U )> e and <7?-7f <-, 
implying that $w'_a(\mathcal{P}^{D,u},\delta,1) > \varepsilon > \eta$. Therefore, the second term in (32) is equal to



1 tP r I v p j ' vP i of 



1 T p i vp > y 



t? < -, p£ £ n„ 
i<Tf <i 7^n n ](i 



(D). 



Now, by the strong Markov property applied to $\mathcal{P}^D$ at time $\tau_1^D$, we have



P rf ; ^i D ^ V °? ' 3- 1 < 2' r ° 



P„» ft V?f ; ^ V°f, ] I {1/3 < rf < 1/2 



(33)

Since we assumed that condition (i) did not hold, there exists $\eta \in \chi(\Pi_n) \setminus \Pi_n$ such that $\mathbb{P}_\eta[U_1^D < \sigma_1^D] > C_1$
for a constant $C_1 > 0$ and $D$ large enough. As $\eta \in \chi(\Pi_n)$, we can choose $\zeta$ such that $\mathbb{P}[\chi(\zeta) = \eta] > 0$
(and $\mathbb{P}_\zeta[\sigma_2 < \infty] > 0$). Now, since we assumed that $\mathbb{P}[\underline{\gamma}^u \neq \gamma^u] > 0$ for all $\gamma \notin \Pi_n$, the probability that
a coalescence event occurs before a scattering event in the structured genealogical process $\xi$ started at
any value not in $\Pi_n$ is greater than a constant $C_2$. Therefore, we can write

for a constant $C_1' > 0$. By the distribution of the epochs of the geographical collisions, the convergence in
law of $(\tau_1^D, \mathcal{P}^D_{\tau_1^D})$ to $(\sigma_2, \chi(\zeta))$ (cf. Lemma 3.1) and the fact that $\eta \notin \Pi_n$, we have for $\zeta$ chosen as above



>c 3 



for a constant $C_3 > 0$ and $D$ large enough, so the expression in the right-hand side of (33) is bounded
below by $C_1' C_3$, and so is (31). Hence, (ii) $\Rightarrow$ (i).

Suppose now that condition (i) is fulfilled. Condition (a) of Corollary 3.7.4 in Ethier and Kurtz (1986)
trivially holds, so we only need to check condition (b) on the modulus of continuity. Fix $\zeta \in \mathcal{P}_n^s$ and
$a > 0$, and let $T > a$ and $\eta > 0$. Firstly, by the convergence in probability of $\sigma_1^D$ to $0$, there exists $D_1 \in \mathbb{N}$
such that for all D > D\ , [of > a] < | , Secondly, we have 

P^ [at least one Rd~ event in [0, T]] < 1 — exp ( — T max cr d (£)) — > 0, D — > 00, 

so there exists D 2 > 1 such that for all D > D 2 , the previous quantity is less than |. Thirdly, by the same 
argument as in the beginning of the proof of Theorem ll.lt there exists N G N such that P^[oat < T] < ?. 
Hence, by Lemma GlU there exists D 3 > 1 such that for all D > D 3 , P([of < T] < §. 
Consequently, we can write for each D > max{Di, D 2 , -D3} and all 5 > 

P ( K(? D '",*,T) > 77] <P c [of > a] + P c [of < T] + P c [at least one R D -event in [0,T]] 

+P c [7i40P D '",c5,T) > 77; of < a; of > T; no i? D -events in [0, T]] 

< ^ + P c [w' a (V D ' u ,6, T) > 77; of < a; of > T; no i? D -events in [0, T]] . 

Furthermore, there exists 5 > such that P 7 [o2 < 3<5] < j^jU for all 7 G n„. Now, for all z G {1, . . . , N}, 
by the strong Markov property applied to V D at time tP_ x and the convergence of P 7 [Tf < 3(5] to 
P 7 [o2 < 3(5], uniformly in 7, we have 

Pclr^ < 00; T P r£ x < 3(5] = E c [l {rf _ i<oo} P^ p ^ [ff < 35]] < JjL 

for Z? large enough. Therefore, 

Pc [ w' a (V D < u ,8,T) > 77; of < a; of < T; no R D -event in [0,T]] 

AT 



»=i 



+Pc K(P D!tl ,(5,r) > 77; of < a; of < T; no fl^-event in [0,T]; rf - rP_ x > 3(5 for all % < N 
s.t. r^ x < 00] (34) 

and the first sum is less than |. To finish, let Vf denote the epoch of the next event after rf if V D D £ II n 

(if it exists, VP = +00 otherwise), and set VP = tP = crP +1 \iV D D G H n . Since we assume that condition 

(i) holds, for alH G {1, . . . , N} we have by the strong Markov property applied at time rf and the fact 
that the distribution of V D p concentrates on x(n„) as D grows to infinity by Lemma [BTTj 

$$\mathbb{P}_\zeta\big[\tau_i^D < \infty;\ V_i^D < \sigma_{i+1}^D\big] \to 0, \qquad D \to \infty,$$
so the last term in (34) is less than



N 

£P C [rf <oo;V/><af +1 ] 
i=i 

+P c [w' a (T D ' u ,5,T) > 17; erf < a; erf > T; no i? D -event in [0,T]; if - if_ x > 38 and 
= <y? + i for alH < N s.t. t?_ x < 00] , 

where the first sum is less than | for D large enough. But on that last event, erf is less than a and no 
i?D-events occur so rf is the epoch of the event directly after erf, then all geographical collisions are at 
least 3(5 away from each other and the erf's are the only times in between at which an event occurs, so 
necessarily w' a (V D , 8,T) = 0. Assembling all the pieces, we obtain that 



P c [w' a (V D ' u ,6,T)> v ] < 



completing the proof of (i) $\Rightarrow$ (ii).

If $\zeta \in \chi(\Pi_n) \cup \Pi_n$, then we only need to show that (i) implies the tightness of $(\mathcal{P}^{D,u})_{D \ge 1}$ on $[0,\infty)$.
Let us directly use the same notation as in the last proof. In the last paragraph, we proved that with
a high probability, there is no accumulation of jumps between the random time $\sigma_1^D$ and $T$. Also,
we can make $\mathbb{P}_\zeta[\tau_1^D < 2a]$ as small as we want by adjusting $a$ and taking $D$ large enough, and the
probability that at least one $R_D$-event occurs is vanishingly small, so we are left with proving that, if
$\delta$ is such that $\mathbb{P}_\zeta[w'_a(\mathcal{P}^{D,u},\delta,T) \ge \eta] < \eta$, $\tau_1^D > 2a$ and no $R_D$-events occur between $0$ and $T$, then
$\mathbb{P}_\zeta[w'(\mathcal{P}^{D,u},\delta',a) \ge \eta] < \eta$ for some $\delta' \in (0,\delta)$. If $\zeta \in \Pi_n$, $\tau_1^D > 2a$ and no $R_D$-events occur, then $\tau_1^D$ is
the epoch of the first event occurring to $\mathcal{P}^D$, so $w'(\mathcal{P}^{D,u},\delta',a) = 0$ for all $\delta' \in (0, \delta \wedge a)$. If $\zeta \in \chi(\Pi_n)$ and
the other conditions hold, then by condition (i) we have

$$\mathbb{P}_\zeta\big[U_1^D = \sigma_1^D\big] \to 1, \qquad D \to \infty,$$

and furthermore $\mathbb{P}_\zeta[\sigma_1^D < a] \to 1$, so with a probability tending to one as $D$ grows to infinity, one event
occurs between $0$ and $a$, then nothing happens between $a$ and $2a$ (there are no $R_D$-events, so the next
event after $\sigma_1^D$ must occur at time $\tau_1^D > 2a$) and the condition on the modulus of continuity is fulfilled
after time $a$ so, for any $\delta' \in (0, \delta \wedge a)$, we do have

$$\mathbb{P}_\zeta\big[w'(\mathcal{P}^{D,u},\delta',a) \ge \eta\big] < \eta.$$

This completes the proof of the case $\zeta \in \chi(\Pi_n) \cup \Pi_n$.

Now, by Theorem 3.7.8 in Ethier and Kurtz (1986), the two ingredients to obtain the convergence
of the processes $(\mathcal{P}^{D,u})_{D \ge 1}$ are tightness, given by the first part of Proposition 3.3 for any $a > 0$, and
convergence of the finite-dimensional distributions, given by Theorem 1.1 and the bijective correspondence
between $\Pi_n$ and $\mathcal{P}_n$. For $\zeta \in \Pi_n$ and $a = 0$, tightness still holds by virtue of the last paragraph, and
an easy modification (namely allowing $t = 0$ in the proof of the convergence of the one-dimensional
distributions) of the proof of Theorem 1.1 in that case, where $\underline{\zeta} = \zeta$ and $\sigma_1^D = 0$ a.s., gives the convergence
of the finite-dimensional distributions of $\mathcal{P}^{D,u}$, including at time $t = 0$. □

Let us briefly comment on the condition $\mathbb{P}_\gamma[U_1^D = \sigma_1^D] \to 1$. If the fast within-deme coalescence is
given by a $\Xi$-coalescent (including Kingman's coalescent) occurring in one deme at a time, the condition
is fulfilled if and only if at most two lineages can be collected into the same deme during a single event.
Indeed, in that case the next step of the genealogical process is either to scatter these two lineages into
two different demes or to merge them into one lineage, the outcome of which is always in $\Pi_n$. If more
than 2 lineages are gathered in the same deme and do not merge during the geographical collision, then
with a positive probability only two of them are involved in the next genealogical event and at least two
rapid steps are needed for $\mathcal{P}^D$ to re-enter $\Pi_n$. The same conclusion holds if two pairs of lineages are
gathered in two demes (meaning 2 lineages per deme), since the genealogical process acts in one deme at
a time by assumption.

4 Collapse of structured genealogical processes



The next proposition states that the only reasonable structured genealogies which collapse to an unstructured
genealogy (given by a $\Xi$-coalescent) when the number of demes tends to infinity are the genealogies
that we have described before, subject to certain conditions.

Note that if we want the lineages to be exchangeable in the limit, the limiting process needs to take
its values in $\bigcup_{n \ge 1} \Pi_n$. Indeed, since the rates of intra- and inter-deme mergers greatly differ, we should
observe only inter-deme events on the slow time scale. This requires that each deme contains at most
one lineage at any given time in the limit.

Proposition 4.1. Let $(\mathcal{P}^D_t, t \ge 0)_{D \ge 1}$ be a sequence of structured genealogical processes with values in
$\bigcup_{n \ge 1} \mathcal{P}_n^s$. Then the following are equivalent:

1. There exists a sequence $r_D$ such that $r_D \to +\infty$ as $D \to \infty$ and two structured genealogical processes,
$(\xi_t, t \ge 0)$ (resp. $(\mathcal{P}_t, t \ge 0)$) with values in $\bigcup_{n \ge 1} \mathcal{P}_n^s$ (resp. $\bigcup_{n \ge 1} \Pi_n$), satisfying

(a) for each $n \in \mathbb{N}$, the sequence of structured genealogical processes $(\mathcal{P}^D_{r_D^{-1} t}, t \ge 0)_{D \ge 1}$ on the fast
time scale, with initial value in $\mathcal{P}_n^s$, converges to $\xi$ as a process in $D_{\mathcal{P}_n^s}[0,\infty)$. In addition, $\xi$
is consistent in the sense of Lemma 2.2;

(b) the sequence $(\mathcal{P}^D_t, t \ge 0)_{D \ge 1}$ on the slow time scale converges towards $\mathcal{P}$ in that the finite-dimensional
distributions (except possibly at time 0) converge as in Theorem 1.1 for every
sample size $n$;

(c) there exists a $\Xi$-coalescent $(R_t, t \ge 0)$ such that for all $n \ge 1$, the unstructured genealogical
process $\mathcal{P}^u$ induced by $\mathcal{P}|_{\Pi_n}$ has the law of the restriction of $R$ to $\mathcal{P}_n$.

2. The rates associated to $\mathcal{P}^D$ satisfy conditions (1), (2) and (3) of Section 3.1, and condition (i) of
Lemma 2.2 holds.

We shall see in the proof that the consistency of $\xi$ is a key ingredient to obtain the desired equivalence.
In fact, if we did not impose it, it would certainly be possible to construct particular examples in which
the unstructured genealogy on the slow time scale is also a $\Xi$-coalescent, but the genealogies within a
deme are not consistent. We would need to impose 'good' values for the corresponding rates. We rather
chose here to emphasize more biologically relevant models, for which the within-deme genealogical process
is also consistent and which can be described as part of an entire class of models rather than special cases.

Proof. The implication 2 $\Rightarrow$ 1 is a consequence of Theorem 1.1, Proposition 3.1 and Proposition 2.1.

Let us prove that 1 $\Rightarrow$ 2. From the definition of a structured genealogical process, blocks can only
move and coalesce. Furthermore $\mathcal{P}^D$ stays in $\mathcal{P}_n^s$ whenever its initial value lies in this set, so we just
need to fix $n \ge 1$ and look at the corresponding rates of scattering, gathering and coalescence. From the
description of the limiting processes $\xi$ and $\mathcal{P}$, the rates of $\mathcal{P}^D$ must be of the form

where for i e {1,2}, (r)\Q — > P^ l \v\C) as D tends to infinity. (To simplify notation, we shall write 

Pd (^10 = P^ (vlC) ■) Thus, p^(r]\() are the rates associated to the generator ^ of the process £. Let us 
check that all cited conditions necessarily hold: 

• If $\zeta \to \eta$ is a 1-event, then by adding an $(n+1)$-st individual in one of the existing blocks (therefore
changing the sizes of the blocks but not their number), we see that the consistency of $\xi$ imposes that
the part of the rate corresponding to the fast time scale depends neither on $n$, nor on the sizes of the
blocks. By exchangeability of the demes, this rate is thus characterized by the number of lineages
present in each deme before and after the transition, the order of these numbers being irrelevant.
Therefore, condition (1) holds. By Lemma 2.2, the consistency of $\xi$ implies that condition (i) of
Lemma 2.2 is also satisfied.

• Once again by consistency of $\xi$, the rate of a 2-event must be of order 1. Indeed, it may otherwise
lead to an additional 1-event for the restriction of the process with the $(n+1)$-st lineage (if this
additional lineage lands in a non-empty deme or in the same deme as another moving lineage
coming from a different subpopulation, and the other dispersing lineages land in different demes),
or involve at least two lineages alone in their demes on the fast time scale. If such an event was
allowed, then by exchangeability of the islands the fast dynamics could act on a structured partition
in $\Pi_n$ and merge two lineages starting from different demes. Again by exchangeability, any pair
of lineages could merge on the fast time scale and so the outcome of $\xi$ would be a single lineage
with probability one, a trivial situation which is of no interest here. Now, since we want to keep
exchangeability of the lineages in the unstructured genealogy (on the slow time scale), the rates
of 2-events should depend only on the number of lineages and their geographical distribution (and
possibly on $n$). But if $\zeta \in \Pi_n$, all lineages are in different demes, so the corresponding rates are
necessarily of the form given in condition (2). If the rates were to depend on $n$, then as the rates of
the fast genealogical process which follows directly (for $D$ large enough, as in the proof of Theorem
1.1) are independent of $n$, the overall transition from $\eta \in \Pi_n$ to the value of $\mathcal{P}^D$ when it re-enters $\Pi_n$
would eventually give different rates for $\mathcal{P}$ acting on $\Pi_n$ and for the restriction to $\Pi_n$ of $\mathcal{P}$ acting
on $\Pi_{n+1}$ (recall the convergence of $\tau^D_{i-1}$ and $\sigma^D_i$ towards $\sigma_i$ to see that the transitions of $\mathcal{P}$ actually
can be described as in Section 3.2). This would contradict the fact that the process $\mathcal{P}^u$ corresponds
to a $\Xi$-coalescent. Finally, we obtain that condition (2) must hold.

• The last argument imposes also that geographical collisions involving $k$ lineages occur at a rate
which is the sum of all corresponding geographical events involving $k+1$ lineages, which is exactly
writing the consistency equations (4) of condition (3).

Finally, we obtain that 1 $\Rightarrow$ 2. □



5 Example 

We now turn our attention to a particular class of metapopulation models which combine a (finite)
$\Lambda$-coalescent within demes with migration between demes and sporadic mass extinction events. We
will use the results derived in the preceding sections to characterize the form that the genealogy takes
in the infinitely many demes limit. This, in turn, will allow us to illustrate how the statistics of the
population-wide $\Xi$-coalescent depend on the interplay between extinction/recolonization events and the
local demographic processes occurring within demes. While these models are quite contrived (in particular,
we have simply imposed the condition that a small number of demes is responsible for repopulating
vacant demes following a mass extinction), they will allow us to explicitly calculate some quantities of
interest.

We first describe how the population evolves forwards-in-time. Suppose that for each $D$, each deme
contains exactly $N$ individuals. Fix $K \in \mathbb{N}$, and let $\Lambda^d(dx)$ and $\Lambda^g(dy)$ be two probability measures
on $[0,1]$ with no atom at 0. Then reproduction, migration, and extinction/recolonization events occur
according to the following rules.

• Each individual in each deme reproduces at rate $D$ according to the following scheme. If an
individual in deme $i$ reproduces, then a number $x$ is sampled from $[0,1]$ according to the probability
distribution $\Lambda^d(dx)$, and then each occupant of that deme dies with probability $x$ and is replaced
by an offspring of the reproducing individual. In terms of the notation of Section 2.1, such an
event has the following representation when $k$ is the label of the reproducing individual. First,
$R^{j,j'} = (0,\ldots,0)$ for all pairs of integers $j \neq j' \in [D]$ and $R^{j,j} = (1,\ldots,1)$ if $j \in [D] \setminus \{i\}$; $R^{i,i}$
is a random vector obtained by choosing a number $x$ according to $\Lambda^d(dx)$, a number $m$ according
to a binomial distribution with parameters $(N,x)$, and finally a set $O \subset [N]$ of offspring of the
reproducing individual by sampling $m$ labels in $[N]$ uniformly without replacement. Then, $R^{i,i}_k = m$,
$R^{i,i}_{k'} = 0$ for all $k' \in O \setminus \{k\}$, and $R^{i,i}_l = 1$ for all $l \notin O \cup \{k\}$.

• At rate $Dm_1$, each individual gives birth to a single migrant offspring which then moves to any one
of the $D$ demes, chosen uniformly at random, and replaces one of the $N$ individuals within that
deme, also uniformly at random. In this case, if $j$ is the label of the deme containing the parent and
$k$ is its label, then a pair is sampled uniformly in $[D] \times [N]$ and the vectors $R$ are as described
in Example [3] of Section 2.1.

• Mass extinction events occur at rate $e$. When such an event occurs, a number $y$ is sampled from $[0,1]$
according to the probability distribution $\Lambda^g(dy)$. Then, each deme goes extinct with probability $y$,
independently of all the others, and is unaffected by the extinction otherwise. Simultaneously, $K$
of the $D$ demes are chosen uniformly at random to be source demes, and the deceased occupants
of the extinct demes are replaced by offspring produced by individuals living in the source demes
according to the following scheme. The parent of each individual recolonizing a deme left vacant
by the mass extinction is chosen uniformly at random and with replacement from among the $NK$
inhabitants of the source demes. If a source deme is chosen from among the extinct ones, then
the parents of the offspring emerging from that deme are the individuals that occupied the deme
immediately prior to the extinction. To describe such an event using the notation of Section 2.1,
suppose that a number $y$ is chosen according to $\Lambda^g(dy)$, a number $m$ is sampled according to a
Binom$(D,y)$-distribution and a (random) set $O_{ext} \subset [D]$ is constructed by sampling uniformly
without replacement $m$ deme labels. Independently, another set $O_{rec}$ of $K$ recolonizing demes is
also chosen by uniform sampling. Then, for all $i \notin O_{ext}$ we have $R^{i,i} = (1,\ldots,1)$ and each deme
$j \in O_{ext} \setminus O_{rec}$ satisfies $R^{j,i} = (0,\ldots,0)$ for all $i \in [D]$. The vectors $R^{j,i}$ with $j \in O_{rec}$ and
$i \in O_{ext} \cup O_{rec}$ are not easily formulated explicitly (in particular, their description depends on
whether the recolonizing demes also go extinct during the event), but it is clear that the evolution
of the population satisfies the two conditions required in Section 2.1.
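The mass extinction rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mass_extinction_event`, the return format, and passing the extinction probability `y` as an argument (rather than drawing it from $\Lambda^g$) are choices made for this example and do not come from the text.

```python
import random


def mass_extinction_event(D, N, K, y, rng):
    """Simulate one mass extinction / recolonization event (forwards in time).

    D: number of demes, N: individuals per deme, K: number of source demes,
    y: extinction probability (in the model, y would be drawn from Lambda^g).
    Returns (extinct, sources, parents) where parents[deme] lists, for each
    of the N recolonizing individuals, a pair (source deme, parent label).
    """
    # Each deme goes extinct independently with probability y.
    extinct = [deme for deme in range(D) if rng.random() < y]
    # K source demes are chosen uniformly without replacement among all demes.
    sources = rng.sample(range(D), K)
    parents = {}
    for deme in extinct:
        # Choosing a source deme and then a label uniformly is the same as a
        # uniform draw, with replacement, among the N*K source inhabitants.
        # Labels refer to the occupants just before the extinction, which
        # covers the case where a source deme is itself extinct.
        parents[deme] = [
            (rng.choice(sources), rng.randrange(N)) for _ in range(N)
        ]
    return extinct, sources, parents


if __name__ == "__main__":
    rng = random.Random(1)
    extinct, sources, parents = mass_extinction_event(D=20, N=4, K=2, y=0.3, rng=rng)
    print(len(extinct), sources)
```

Because all colonists descend from only $K$ demes, many lineages sampled from different extinct demes can find their parents in the same source deme, which is the mechanism behind the geographical collisions discussed below.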

Suppose that $n$ individuals are sampled from the population at time 0, and let us consider the evolution
(backwards-in-time) of the structured coalescent process $\mathcal{P}^D$ in $\mathcal{P}_n^s$. From the description of the model
forwards-in-time, the events affecting the genealogy occur at the following rates:

1. If a deme contains $b$ lineages, then each $k$-tuple of lineages in this deme (for $k \le b$) merges into one
lineage in the same deme at rate



Furthermore, any merger event occurs in one deme at a time. 

2. Each lineage migrates (alone) at rate $Dm_1$. Indeed, the total rate at which migrant offspring are
produced forwards-in-time is $ND \times Dm_1$, but the probability that such a migrant belongs to the
lineage under consideration is $(ND)^{-1}$ (recall that the deme and the label of the individual replaced
by the migrant are chosen uniformly at random). Consequently, the probability that a migrating
lineage lands in a non-empty deme is $D^{-1}$ times the number of demes occupied by the other lineages
of $\mathcal{P}^D_{t-}$. When such an event occurs, the probability that the migrating lineage also merges with an
ancestral lineage present in the source deme is $N^{-1}$ times the number of distinct ancestral lineages
present in that deme.

3. Extinction events generate geographical collisions at rate $O(1)$. Because the $K$ recolonizing demes
are chosen uniformly from among the $D$ islands, recolonization by a deme containing at least one
lineage of the genealogical process occurs with a probability of order $O(D^{-1})$, and so these events are
negligible in the limit. Suppose that $\mathcal{P}^D_{t-} \in \Pi_n$. Let $k \le |\mathcal{P}^D_{t-}|$, $r \le K$, and let $k_1,\ldots,k_r$ be integers
greater than 1 and summing to $k$. For each $i \in \{1,\ldots,r\}$, let $L_i = \{l_{i1},\ldots,l_{ij_i}\}$ be a collection of $j_i$
integers summing to $k_i$. Then each $(|\mathcal{P}^D_{t-}|; k_1,\ldots,k_r,1,\ldots,1; L_1,\ldots,L_r,\{1\},\ldots,\{1\})$-geographical
collision occurs at rate

$$e \int_0^1 \Lambda^g(dy) \sum_{s=0}^{|\mathcal{P}^D_{t-}|-k} \mathbf{1}_{\{s \le K-r\}} \binom{|\mathcal{P}^D_{t-}|-k}{s}\, y^{k+s} (1-y)^{|\mathcal{P}^D_{t-}|-k-s}\, \frac{K!}{(K-r-s)!\,K^{k+s}} \prod_{i=1}^{r} \frac{N!}{(N-j_i)!\,N^{k_i}} + O\Big(\frac{1}{D}\Big)$$

$$= \lambda^g_{|\mathcal{P}^D_{t-}|;\,k_1,\ldots,k_r,1,\ldots,1;\,L_1,\ldots,L_r,\{1\},\ldots,\{1\}} + O\Big(\frac{1}{D}\Big). \qquad (35)$$

The rate expression that appears in Eq. (35) can be interpreted in the following way. As well as the
$k$ ancestral lineages that are known to be affected by the disturbance (this is specified by the type of
event), an additional $s$ lineages may be caught up in the extinction event and moved to demes where
they remain isolated (hence producing no changes in the structured genealogy). In (35), the first part in
each term of the sum corresponds to the number of choices for these $s$ additional lineages, followed by the
probability that only these $k+s$ lineages are affected. The condition $r+s \le K$ is imposed by the fact
that the $r$ groups of lineages geographically gathered and the $s$ lineages affected but remaining alone in
their demes must then belong to $r+s$ distinct recolonizing demes. The middle part of the term specifies
the probability that the affected lineages are grouped in the desired way: regardless of the labels of the
recolonizing demes, if the latter contain no lineages of the sample just before the extinction then
$K!/(K-r-s)!$ is the number of ways of assigning $r+s$ distinct recolonizing demes to the groups of affected
lineages, while $K^{-k-s}$ is the probability that each of the $k+s$ lineages moves to the prescribed recolonizing deme. Finally,
the last product is obtained in a similar manner by allocating as many distinct ancestors as required to
the groups of lineages gathered into the same demes. As explained above, the $O(D^{-1})$ remainder term
accounts for the probability that at least one of the finitely-many recolonizing demes contains a lineage
of $\mathcal{P}^D_{t-}$.
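The middle factor described above, $K!/((K-r-s)!\,K^{k+s})$, can be checked by brute force for small values. The Python sketch below is an illustration only: the function names and the encoding of the event by a tuple `group_sizes` (the sizes $k_1,\ldots,k_r$ followed by $s$ ones) are assumptions made for this example. It compares the closed form with an exhaustive enumeration of all ways in which the affected lineages can choose among the $K$ recolonizing demes.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def grouping_probability(K, group_sizes):
    """Closed form K! / ((K - g)! K^total), with g groups and total lineages."""
    g, total = len(group_sizes), sum(group_sizes)
    if g > K:
        return Fraction(0)  # not enough recolonizing demes for g distinct groups
    return Fraction(factorial(K), factorial(K - g) * K ** total)


def grouping_probability_bruteforce(K, group_sizes):
    """Enumerate every assignment of lineages to the K recolonizing demes."""
    # Label each lineage by the index of the group it must belong to.
    labels = [idx for idx, size in enumerate(group_sizes) for _ in range(size)]
    hits = 0
    for choice in product(range(K), repeat=len(labels)):
        deme_of_group = {}
        consistent = True
        for group, deme in zip(labels, choice):
            # All lineages of a group must land in the same deme.
            if deme_of_group.setdefault(group, deme) != deme:
                consistent = False
                break
        # Distinct groups must land in distinct demes.
        if consistent and len(set(deme_of_group.values())) == len(group_sizes):
            hits += 1
    return Fraction(hits, K ** len(labels))


if __name__ == "__main__":
    # One gathered pair (k_1 = 2) and one isolated affected lineage (s = 1), K = 3.
    print(grouping_probability(3, (2, 1)))
    print(grouping_probability_bruteforce(3, (2, 1)))
```

Exact rational arithmetic is used so that the two computations can be compared for equality rather than up to rounding error.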

Let us say that a simple collision occurs when a single lineage moves into a non-empty deme, and
possibly merges with one of the lineages present in this deme. To verify that the convergence results from
the previous sections apply to the example, it will be convenient to introduce the following quantities,
defined for all $\zeta, \eta \in \mathcal{P}_n^s$:

$$\phi_c(\zeta,\eta) = \begin{cases} 1 & \text{if } \zeta \to \eta \text{ is a simple collision with coalescence,} \\ 0 & \text{otherwise,} \end{cases}$$

and likewise

$$\phi_{nc}(\zeta,\eta) = \begin{cases} 1 & \text{if } \zeta \to \eta \text{ is a simple collision without coalescence,} \\ 0 & \text{otherwise.} \end{cases}$$

By 'with coalescence' (resp. 'without coalescence'), we mean that the migrating lineage merges (resp.
does not merge) with a lineage in the source deme during the same event.

Let us consider a particular 1-event. If this event involves a single lineage moving to an empty deme,
it may be caused either by a migration event of the kind described in item 2 above (which occurs at rate
$Dm_1(1 - k/D)$ if $k$ is the number of demes occupied by the other lineages at the time of the event), or by
a mass extinction event (whose rate is of order $O(1)$ according to item 3). Consequently, the overall rate
of any 1-event is of the form $Dm_1 + O(1)$. Groups of more than one lineage can also move simultaneously,
but only through an extinction event and so at a rate of order $O(1)$. If the event involves an intra-deme
merger, then its rate is easily written in the form given in Section 3.1 with $r_D = D$; see item 1. A 2-event
$\zeta \to \eta$ occurs at a rate of order $O(1)$, and in particular if $\zeta \in \Pi_n$, then this rate is given by

A^C,J 7 ) + 2m 1 {^(C, J ?)i +<MC,'7)^r} + = X9 (C,v)+0(^), 

where $\lambda^g(\zeta,\eta)$ is the rate of the unique extinction event which turns $\zeta$ into $\eta$. In this expression, the term
in brackets is nonzero only if the event is a simple collision involving two lineages that have been collected
in the same deme through migration. Such collisions occur at rate $2m_1$, and then the two lineages either
coalesce, with probability $N^{-1}$, or remain distinct, with probability $1 - N^{-1}$. Finally, we must check
that the $\lambda^g$'s satisfy (4), and that the rates on the fast time scale satisfy condition (i) of Lemma 2.2. The
latter condition follows from the description of the rates and the consistency of $\Lambda$-coalescents, and the
validity of the former condition can be deduced from the fact that lineages choose independently of each
other whether they are involved in the event, and which of the $NK$ individuals they take as a parent. We
leave the straightforward but tedious calculations to the interested reader.

All conditions of Theorem 1.1 and Proposition 2.1 hold. Thus, we can conclude that the finite
dimensional distributions of $\mathcal{P}^D$ converge to those of a structured genealogical process $\mathcal{P}$ with values in
$\Pi_n$, and that the unstructured process $\mathcal{P}^u$ is a $\Xi$-coalescent with values in $\mathcal{P}_n$. Let us describe $\mathcal{P}^u$ as
precisely as we can. To apply the results of Section 2.3, we need to know the distribution of the final
state of the 'fast' process $\xi$ that was introduced in Section 2.2. Starting from a structured partition
where all blocks are contained in the same component (i.e., all lineages lie initially in the same deme),
this distribution coincides with the sampling distribution of the infinitely many alleles model of the
generalized Fleming-Viot process dual to the $\Lambda$-coalescent with finite measure $x^2 \Lambda^d(dx)$ acting within
this deme. Indeed, on the fast time scale, ancestral lineages belonging to a common deme migrate out to
distinct, empty islands, a process analogous to mutation to unique types with a 'mutation' rate equal to
$m_1$. Recursion formulae are given in Möhle (2006) which can be used to compute the probability $p(\mathbf{n})$ of
unordered allele configurations $\mathbf{n} = \{n_1,\ldots,n_k\}$ in the infinitely many alleles model when the genealogy
is given by a $\Lambda$- or a $\Xi$-coalescent. In our case, the formula of interest is (with $p(1) = 1$):

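$$p(\mathbf{n}) = \frac{n m_1}{g_n + n m_1} \sum_{j=1}^{k} \mathbf{1}_{\{n_j = 1\}}\, \frac{1}{k}\, p(\tilde{\mathbf{n}}_j) + \sum_{i=1}^{n-1} \frac{g_{n,n-i}}{g_n + n m_1} \sum_{j=1}^{k} \mathbf{1}_{\{n_j > i\}}\, \frac{n_j - i}{n - i}\, p(\mathbf{n} - i e_j),$$

where $n = \sum_{j=1}^{k} n_j$, $\tilde{\mathbf{n}}_j = (n_1,\ldots,n_{j-1},n_{j+1},\ldots,n_k)$, $e_j$ denotes the $j$'th unit vector in $\mathbb{R}^k$ and $g_{nk}$
(resp. $g_n$) is the rate at which the number of lineages decreases from $n$ to $k$ (resp. the total rate at which
the number of lineages changes when $n$ lineages are alive), given by

$$g_{nk} = \binom{n}{k-1} \int_0^1 \Lambda^d(dx)\, x^{n-k+1} (1-x)^{k-1}$$

and

$$g_n = \sum_{k=1}^{n-1} g_{nk} = \int_0^1 \Lambda^d(dx)\, \big(1 - (1-x)^{n-1}(1 - x + nx)\big).$$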
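To illustrate how the recursion can be evaluated in practice, here is a small Python sketch. It relies on assumptions that are not part of the text: $\Lambda^d$ is taken to be a point mass at a hypothetical value `X0`, so that the integrals defining $g_{nk}$ and $g_n$ reduce to closed expressions, the migration rate `M1` is an arbitrary illustrative value, and a configuration is encoded as an ordered tuple of block sizes. The recursion coded here is $p(\mathbf{n}) = \frac{n m_1}{g_n + n m_1} \sum_{j : n_j = 1} \frac{1}{k} p(\tilde{\mathbf{n}}_j) + \sum_{i=1}^{n-1} \frac{g_{n,n-i}}{g_n + n m_1} \sum_{j : n_j > i} \frac{n_j - i}{n - i} p(\mathbf{n} - i e_j)$.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

X0 = Fraction(1, 2)  # assumption: Lambda^d is a Dirac mass at x0 = 1/2
M1 = Fraction(1, 3)  # assumption: illustrative migration ("mutation") rate m_1


def g_nk(n, k):
    """Rate at which n lineages decrease to k (n - k + 1 lineages merge)."""
    return comb(n, k - 1) * X0 ** (n - k + 1) * (1 - X0) ** (k - 1)


def g_n(n):
    """Total merger rate with n lineages, in closed form."""
    return 1 - (1 - X0) ** (n - 1) * (1 - X0 + n * X0)


@lru_cache(maxsize=None)
def p(config):
    """Probability of the configuration (tuple of block sizes) via the recursion."""
    n, k = sum(config), len(config)
    if n == 1:
        return Fraction(1)
    denom = g_n(n) + n * M1
    total = Fraction(0)
    # Last event was the migration of a lineage forming a singleton block.
    for j, nj in enumerate(config):
        if nj == 1:
            total += (n * M1 / denom) * Fraction(1, k) * p(config[:j] + config[j + 1:])
    # Last event was a merger of i + 1 lineages inside block j.
    for i in range(1, n):
        for j, nj in enumerate(config):
            if nj > i:
                reduced = config[:j] + (nj - i,) + config[j + 1:]
                total += (g_nk(n, n - i) / denom) * Fraction(nj - i, n - i) * p(reduced)
    return total


def compositions(n):
    """All ordered tuples of positive integers summing to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


if __name__ == "__main__":
    print(p((2,)), p((1, 1)))
    print(sum(p(c) for c in compositions(4)))
```

With this encoding, the values of `p` over all ordered tuples summing to `n` add up to one, which gives a quick sanity check on the implementation; other choices of $\Lambda^d$ would only require replacing `g_nk` and `g_n` by the corresponding integrals.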



These expressions are related to the distribution of the final state of $\xi$ by the following formula:

$$
\mathbb{P}_{\zeta}\bigl[\xi = (\{B_1\}, \dots, \{B_k\}, \emptyset, \dots, \emptyset)\bigr] = p(|B_1|, \dots, |B_k|),
$$

where $\zeta = (\{\{1\}, \dots, \{n\}\}, \emptyset, \dots, \emptyset)$ is the initial state and $|B_i|$ denotes the number of elements in the block $B_i$. Indeed,
because the dynamics on the fast time scale of lineages occupying different demes are independent, the
final state of the fast genealogical process is the concatenation of all the final states of the groups of
lineages starting in the same deme. Hence, the preceding results are sufficient to describe $\xi$ for any
$\zeta \in \mathcal{P}_n^s$. Unfortunately, with this level of generality, there does not appear to be a simple description of
the measure Ξ associated to $\mathcal{P}^u$, but the rate associated to its Kingman part (that is, its mass at 0) is
given by:

$$
2 m_1 \frac{1}{N} + 2 m_1 \frac{N-1}{N}\, p(2) = 2\,\frac{m_1}{N} \left\{ 1 + (N-1)\, \frac{\int_0^1 \Lambda^d(dx)\, x^2}{\int_0^1 \Lambda^d(dx)\, x^2 + 2 m_1} \right\}. \tag{36}
$$

The first term in (36) corresponds to a simple collision with coalescence, and the second term to a simple
collision without coalescence; the probability that the lineages then coalesce before one of them migrates
is given by $p(2)$.
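
As a numerical sanity check on the recursion for $p(\mathbf{n})$, the rates $g_{nk}$ and $g_n$, and the Kingman mass in (36), the following Python sketch evaluates them exactly with rational arithmetic. It assumes, purely for illustration, that the within-deme measure $\Lambda^d$ is a finite sum of atoms and that configurations are handled as ordered tuples $(n_1, \dots, n_k)$; the names `ATOMS_D`, `M1`, `g_nk`, `g_n`, `g_n_closed`, `p`, `compositions` and `kingman_mass` are hypothetical and the parameter values are arbitrary.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

# Assumption: a discrete stand-in for Lambda^d, given as (atom x, weight) pairs.
ATOMS_D = ((Fraction(1, 2), Fraction(1)), (Fraction(1, 4), Fraction(2)))
M1 = Fraction(1, 3)  # illustrative migration ('mutation') rate m_1


def g_nk(n, k):
    """Rate at which n lineages drop to k: C(n, k-1) * int x^(n-k+1) (1-x)^(k-1)."""
    integral = sum(w * x ** (n - k + 1) * (1 - x) ** (k - 1) for x, w in ATOMS_D)
    return comb(n, k - 1) * integral


def g_n(n):
    """Total rate at which the number of lineages changes, as a sum over k."""
    return sum(g_nk(n, k) for k in range(1, n))


def g_n_closed(n):
    """Closed form: int (1 - (1-x)^(n-1) (1 - x + n x)) Lambda^d(dx)."""
    return sum(w * (1 - (1 - x) ** (n - 1) * (1 - x + n * x)) for x, w in ATOMS_D)


@lru_cache(maxsize=None)
def p(config):
    """Recursion for p(n) on an ordered tuple of class sizes, with p((1,)) = 1."""
    n, k = sum(config), len(config)
    if n == 1:
        return Fraction(1)
    total = g_n(n) + n * M1
    # Migration term: remove one singleton class (only classes with n_j = 1).
    singles = sum(p(config[:j] + config[j + 1:]) for j in range(k) if config[j] == 1)
    value = n * M1 / total * singles / k
    # Merger terms: a class with n_j > i loses i members in one event.
    for i in range(1, n):
        rate = g_nk(n, n - i)
        for j in range(k):
            if config[j] > i:
                smaller = config[:j] + (config[j] - i,) + config[j + 1:]
                value += rate / total * Fraction(config[j] - i, n - i) * p(smaller)
    return value


def compositions(n):
    """All ordered tuples of positive integers summing to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def kingman_mass(N):
    """Right-hand side of (36): 2 (m_1 / N) {1 + (N - 1) p(2)}."""
    return 2 * M1 / N * (1 + (N - 1) * p((2,)))
```

Under these assumptions, `p((2,))` reduces to $g_2 / (g_2 + 2 m_1)$, which is the ratio appearing inside the braces of (36), and the values of `p` over all ordered configurations of a given sample size sum to one. Exact fractions are used so that such identities can be tested without floating-point tolerance.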

One case which can be characterized more thoroughly is when dispersal between demes only occurs
during extinction-recolonization events ($m_1 = 0$). For example, this might be a reasonable approximation
to make when modeling a population in which migrants are at a substantial competitive disadvantage
relative to residents, so that dispersal is only successful into demes in which the resident population has
gone extinct. In this case, the Kingman component of the genealogy disappears (see (36)). Furthermore,
viewed backwards in time, lineages gathered into common demes by mass extinction events cannot migrate
away before the rapid within-deme coalescent reaches a common ancestor, and so any such group of
lineages merges instantaneously into a single lineage. The shape of the resulting global coalescent therefore
is determined only by the way in which mass extinction events gather lineages together. Recall the
expression for the rates of geographical collisions given in (35), and let us examine how $K$, the number
of demes contributing colonists in the wake of a mass extinction, affects the shape of the genealogy.

If $K = 1$, all lineages affected by a mass extinction event have parents within the same deme. The
resulting genealogy is a Λ-coalescent, and the rate at which $k$ ancestral lineages merge when $m$ are present
is equal to the rate at which exactly $k$ lineages are caught up in an extinction event when $m$ demes contain
one lineage, that is

$$
e \int_0^1 \Lambda^e(dy)\, y^k (1-y)^{m-k}.
$$

On the other hand, if we let $K$ tend to infinity, then each term in the sum in (35) is asymptotically
equivalent to $\frac{K!}{(K-r-s)!}\,K^{-k-s} \sim K^{r-k}$, up to a constant (recall that the sample size $n$ is finite and
bounds the number of lineages at any time). Consequently, binary geographical collisions ($k = k_1 = 2$,
$r = 1$, $j_1 \in \{1, 2\}$) occur at a rate of order $O(K^{-1})$, whereas the rate of a collision involving at least
3 lineages is of order at most $O(K^{-2})$. Hence, for fixed sample size $n$, the probability that only binary
mergers occur in the sample genealogy approaches 1 as $K$ tends to infinity, and the rate of each binary
merger (multiplied by $K$) converges to

$$
e \int_0^1 \Lambda^e(dy)\, y^2, \tag{37}
$$

where the term $y^2$ is obtained by observing that the condition $s \le K - 1$ in (35) is always fulfilled for
$n$ fixed and $K$ large enough, and that $\sum_{s=0}^{m-2} \binom{m-2}{s} y^{2+s} (1-y)^{m-2-s} = y^2$. Once the lineages
are gathered into the same deme, they can only coalesce and they do so instantaneously on the slow
time scale as $D \to \infty$. It follows that if time is rescaled by a factor of $DK$, then the rate of a binary
merger converges to that of Kingman's coalescent run at the rate shown in (37). Moreover, under this
time rescaling, the rates of the finitely many possible multiple merger events converge to 0 as $K$ grows
to infinity, and so the limiting (as $D \to \infty$) unstructured genealogical process $\mathcal{P}^u$ corresponding to an
evolution with $K$ recolonizing demes converges to Kingman's coalescent as a process in $D_{\mathcal{P}_n}[0, \infty)$ as $K$
tends to infinity. (Note, however, that this does not imply that one can interchange the limits $D \to \infty$
and $K \to \infty$.) Finally, if $K$ is finite but greater than 1, then geographical collisions involving more than
two lineages occur at a non-negligible rate, and so the resulting unstructured genealogy is a more general
Ξ-coalescent.
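
The two identities used in this argument are easy to verify numerically. The sketch below is only an illustration under stated assumptions: the extinction measure (written $\Lambda^e$ here) is replaced by a hypothetical finite set of atoms, the extinction rate $e$ is an arbitrary value, and the names `E`, `ATOMS_E`, `merger_rate`, `binary_limit` and `pair_mass` are not taken from the paper. It checks that the $K = 1$ merger rates satisfy the usual consistency relation of a Λ-coalescent, and that summing over the number $s$ of additional affected lineages leaves the factor $y^2$ appearing in (37).

```python
from fractions import Fraction
from math import comb

E = Fraction(1, 2)  # illustrative extinction rate e
# Assumption: a discrete stand-in for the extinction measure, as (atom y, weight) pairs.
ATOMS_E = ((Fraction(1, 3), Fraction(1)), (Fraction(3, 4), Fraction(1, 2)))


def merger_rate(m, k):
    """K = 1 case: e * int y^k (1-y)^(m-k), for k lineages merging among m."""
    return E * sum(w * y ** k * (1 - y) ** (m - k) for y, w in ATOMS_E)


def binary_limit():
    """Limiting binary merger rate (37): e * int y^2."""
    return E * sum(w * y ** 2 for y, w in ATOMS_E)


def pair_mass(y, m):
    """Sum over s of C(m-2, s) y^(2+s) (1-y)^(m-2-s), which collapses to y^2."""
    return sum(comb(m - 2, s) * y ** (2 + s) * (1 - y) ** (m - 2 - s)
               for s in range(m - 1))
```

The consistency relation `merger_rate(m, k) == merger_rate(m + 1, k) + merger_rate(m + 1, k + 1)` holds for any choice of atoms, since $y^k(1-y)^{m-k} = y^k(1-y)^{m+1-k} + y^{k+1}(1-y)^{m-k}$.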

This example shows that a large class of coalescent processes can arise in the infinitely many demes 
limit of a subdivided population with sporadic mass extinctions. Depending on both the migration and the 
extinction rates, as well as on the number of demes contributing to population recovery following a mass 
extinction, the limiting genealogical process can range from Kingman's coalescent ($K \to \infty$), as derived
by Wakeley (2004), to a Λ-coalescent ($K = 1$, $m_1 = 0$), with a family of Ξ-coalescents interpolating
between these two extremes. In this particular class of models, multiple mergers of ancestral lineages are
more likely to occur when all three parameters, $K$, $N$ and $m_1$, are small, so that mass extinctions have
a non-negligible probability of gathering lineages into a common deme where they undergo a series of 
rapid mergers before being scattered again by migration. This observation suggests that it is a generic 
property of structured population models that if the limiting coalescent admits any multiple mergers, 
then it also admits simultaneous mergers. 